Archives

These are unedited transcripts and may contain errors.


Plenary Session
15th May 2014
9 a.m.


CHAIR: Good morning. Welcome to the early morning session of the second day of RIPE 68. It will be myself, Brian, and Will chairing the session this morning. Just to remind you all very briefly: all of the talks can be rated through your RIPE NCC Access account on the system, and please do rate them, because it is what the PC uses to look at the content, see what's good and not good, and prepare the programmes for future talks. Also, to mention again that there are two seats available on the PC for the elections on Friday, so if you wish to nominate yourself or somebody else, please mail pc [at] ripe [dot] net.

The first talk this morning is from Leon on better crypto.

LEON AARON KAPLAN: So, hi. Good morning. Thanks that so many of you came, because this talk is going to be about cryptography, and it's based on mathematics. The one thing I hate about mathematics is doing it in the morning; without coffee it doesn't work, usually you need cups of coffee to actually turn on your brain and start producing formulas.

But the good news is I'm not going to do any mathematics here. This is about a project that started in the summer and early fall of 2013, after... well, this happened, and basically that was very good motivation for us; every crisis is a great motivation to do something good. And, of course, I don't want to do any finger-pointing here, it was the obvious thing: we do have a general problem on the Internet with keeping things secret when we want to keep them secret, or private when we want to keep our customer data private, etc. It's not just this agency but other agencies as well.

So, our belief in doing this document was: don't give them anything for free. You, as operators, or as service providers or website hosts, whatever, you own that space. It's your home. You need to protect things, and we actually do have something that protects us, which is cryptography; as long as we can trust the mathematics, there is still some hope. So, just in case you have heard enough already, here are the quick links upfront, and I'm going to show them again. This document is totally open source; you can get it here from the Git repository, which is mirrored on GitHub. There is a mailing list, we have a chat channel, and there is a web page.

Okay, so, why is this relevant for you? Well, most of you are probably operators. After we started writing this document, all of us read that even sysadmins seem to be targets, because they hold the keys to the kingdom. So basically it's now a matter of protecting everything which could potentially give unauthorised access to some of your crown jewels, or your network, or your routers, or whatever. However, the problem is that good cryptography is not so easy to actually do, and it's very easy to make mistakes. We do know that crypto works, some of it, maybe not all of it, but we also know that it's really tricky and there are lots of pitfalls. What you usually do is hold competitions or have multiple people review things. So what we did, this group of people that simply started the project, is we got together and brought all the information we could find about good common best practices for crypto settings, mostly SSL, into one place, and that's basically the Better Crypto document. It's open for new suggestions and we see it as an open review process, and I'm going to come back to that; it's actually one of the core points: open reviews of crypto recommendations.

So, here is the idea. We need to do something against the Cryptocalypse that happened. We know there are different problems with all kinds of crypto algorithms, settings and pitfalls, etc., but we can find some good practices for SSL, SSH, PGP, etc., as long as we don't have any proof that something else is broken there. However, most people don't deal with crypto on an everyday basis, so we need easy copy-and-pastable settings for the average admin user. That's the key here. Keep it short and have many people review it; make it open source. That's the basic idea of our document.

So, what do we have so far? A big disclaimer. A chapter on methods, how we developed the paper. A chapter on ECC, a chapter on key lengths, a chapter on cipher suites: which cipher settings we recommend, and why. These cipher suites are then copied into the different config snippets for the different services, like the Postfix mail service, Apache, etc. There is a chapter on tools you can use to test, and then, of course, lots of references and links.

So, how did we do that? We got together and collected the best information that we could find. We intentionally have an ongoing, permanent public review, and that's also why I'm speaking to you, because I would encourage you to review this document, and if you find any bug, please shout, on the mailing list or on the chat channel, whatever. I think that's actually the only thing that really helps us at the moment, because with the documents that came out, we heard of so many things that happened over time, and so much trust in IT systems broke down. Just recently we read that servers shipped to different places were opened before they got delivered and something was installed, etc. So this is a big crisis in the trust in our global IT systems and networks. The only thing that I could come up with is: at least we can find some recommendations and have people review them. The more eyes review it, the better, and the higher the chance that it is actually correct.
It's like physics.

So, some general remarks on crypto, and let's start with ECC, because that's one of the very efficient ways to do crypto at the moment. If you look at the current state of ECC, there are lots of NIST curves. However, NIST also got some critique in the last year because of the parameters of the NIST curves, like P-256. There are voices, especially Dan Bernstein, saying: we don't know where these parameters came from. The usual "nothing up my sleeve" argument doesn't hold here, because we can't see that there is nothing up the sleeve; we simply don't know where these parameters came from. So we know that we might have to change these curves for ECC at some point, and we need to stay flexible.

However, most applications need the NIST curves at the moment. There may be some exceptions; I think Google and Chrome are going in a different direction with ChaCha, etc., going into different crypto areas. However, this is a good example of why we need to be able to change quickly in case something happens. So: algorithm agility.

Key lengths: a very important thing, of course. In our paper, we follow the ECRYPT-II recommendations, which is a publication by a whole bunch of great cryptographers, mostly from Europe. ECRYPT-II says RSA at least 3248 bits, and probably the safe recommendation is 4k; for ECC, also more than 256 bits, except for special purposes where the lifetime you need is very short. We also had the question whether AES is actually still a good thing, so we went and asked the original authors of AES, and we got a very interesting answer from Vincent Rijmen. He basically said: yeah, there might be some theoretical issues, but it's like wearing a helmet in my car, I don't usually do it. I really like that quote. Actually, if you look at it, there are more theoretical crypto attacks on AES-256 than on AES-128.
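To make that concrete, generating keys along those lines might look as follows; this is a minimal sketch, the file names are placeholders, and secp384r1 is just one example of a curve with more than 256 bits:

# 4096-bit RSA, following the "safe recommendation is 4k" advice.
openssl genrsa -out server-key.pem 4096

# An ECC key on a >256-bit curve (here the 384-bit NIST P-384 curve).
openssl ecparam -genkey -name secp384r1 -out server-ecc-key.pem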

Another good tool that I can recommend, and that we found while writing this paper, is a page called keylength.com, where you can specify the key size, how many bits you want to have, and it lists the different papers which estimate how long this key length will last, until 2020 or 2030. So you have a comparison chart and you can make your own choices. That was a very helpful tool for us.

So, I'm sure most of you know perfect forward secrecy. Basically, the idea is that you change the secret keys quite often, because of what happened with Lavabit, the e-mail hoster of Edward Snowden: if you have one single RSA key encrypting your SSL sessions, and that secret SSL key gets seized at some point, and somebody is able to sniff and capture all the traffic, which is doable, then they can essentially decrypt it after the fact. Forward secrecy helps against that, and in our paper we basically have forward secrecy by default as a setting, which is Diffie-Hellman ephemeral. However, there are some problems here: it is more expensive to do, crypto-wise.

There is an alternative, of course: generating lots of new private keys. If you can do that on a regular basis, that is also possible.

Okay, random number generators. A very important topic, and most of the time it's overlooked. There are great papers by Nadia Heninger and by Lenstra comparing the host keys from a scan of the Internet, and especially in embedded devices there is a huge number of hosts which have repeated keys, or keys shared across VMs. So if you operate virtual machines, make sure that the private keys don't get copied from VM to VM but are actually generated anew, because there are lots of them which are just copied, and if you have lots of keys generated from the same bad entropy you can actually find the common factors through a mathematical trick.
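To make that "mathematical trick" concrete: if two RSA moduli happen to share one prime factor, a plain GCD recovers it without any factoring. A minimal sketch with toy numbers (real host keys would be 1024-bit or larger moduli harvested from a scan); math.gcd needs Python 3.5 or later:

from math import gcd

# Toy RSA moduli that share the prime p, as happens when cloned VMs or
# low-entropy embedded devices derive "different" keys from the same prime.
p, q1, q2 = 1000003, 1000033, 1000037
n1, n2 = p * q1, p * q2

shared = gcd(n1, n2)  # fast, no factoring required
if shared > 1:
    # Both keys are now broken: the remaining factors follow by division.
    print("common factor:", shared, "->", n1 // shared, "and", n2 // shared)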

So, that's pretty important as well. Many embedded devices start with the first of January 1970 as their time starting point when they boot, because they have no time source, and then of course they generate their first SSH keys or something like that, and their entropy is really bad at that moment. The other thing is, as we heard yesterday about this platform, many embedded devices don't do random number generation properly. I hope this one does; I have great hopes that it does because it's open hardware. However, for example, on MIPS-based systems, the get_cycles call returns a constant zero, and that's really bad for random number generators.

And of course, there are not only bad random number generators in embedded devices; as we know, there are also intentionally tweaked or backdoored random number generators. We don't know anything about the Intel RNG inside the Intel CPU; how do you prove whether there is a backdoor? So, therefore, in our paper we did some research and we found some nice tools which support increasing the entropy pool, and there is a daemon for the HAVEGE algorithm, haveged, which mixes all kinds of different entropy sources and feeds them into the entropy pool.
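On a Linux box, checking and feeding the kernel entropy pool can look like this; a hypothetical Debian-style example:

# How much entropy does the kernel pool currently hold? Freshly booted
# embedded systems often show alarmingly low values here.
cat /proc/sys/kernel/random/entropy_avail

# Install the haveged daemon, which feeds the pool using the HAVEGE algorithm.
apt-get install haveged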

So... and there is good news from Heartbleed: we need to regenerate some keys now anyway. So...

Attacks. I'm going to skip some attacks, because this is a longer talk than I have time for here, but I think for reading the paper and understanding why we came to certain conclusions, it's important to understand the attacks. The BEAST attack is well documented, the CRIME attack is well documented. It pays off, if you want to build your own cipher suites, to first understand these attacks.

Let's go on to the cipher suites. Basically, in our discussion about writing this document, we came up with two variants, A and B. Variant A is the really hardened setting that you could use for internal system-to-system communication, where you don't depend on any other users or anybody else, and you can have your own CA and whatever; it doesn't matter. Variant B is the more compatible version with weaker settings, but it's really a trade-off between compatibility, the clients that you want to support, and the strength of the ciphers and the key lengths, etc.

So, some general rules that we applied: we disable SSL version 2 and version 3 and require TLS 1.0 or better, preferably 1.2. TLS compression is disabled, and we basically enable HSTS, HTTP Strict Transport Security, the header on the web server which says "only speak to me via HTTPS".
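Translated into an Apache context, those general rules might look like the following sketch (mod_ssl and mod_headers assumed); this is illustrative, not a verbatim excerpt from the document:

# TLS 1.0 or better only; SSLv2 and SSLv3 are disabled.
SSLProtocol All -SSLv2 -SSLv3
# TLS compression off (CRIME countermeasure).
SSLCompression off
# The server, not the client, decides the cipher order.
SSLHonorCipherOrder On
# HSTS: tell browsers to only speak HTTPS to this host for a year.
Header always set Strict-Transport-Security "max-age=31536000"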

So this is the cipher string for Variant A, and it expands to this list of supported settings, which is basically AES-GCM, AES-256, SHA-2, AEAD, etc. You can use that setting if you have an internal system where just your own systems talk to each other and you don't depend on anybody else.

However, it excludes many regular clients. Variant B is the more recommended one. There is one special thing here: we do support Camellia. We were asked why we actually support Camellia; again, because of the algorithm agility argument: what happens if AES has been attacked successfully and we don't know about it yet? Then it's good to have an alternative. So basically, the order of the cipher string says in which order a client should try the different settings.
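As an illustration of what such a string looks like in practice, here is a hypothetical Variant-B-style cipher string (not the document's exact one): forward-secrecy AEAD suites first, a Camellia suite retained for algorithm agility, and known-weak options excluded at the end:

ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA:DHE-RSA-AES256-SHA:DHE-RSA-CAMELLIA256-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5

You can see which suites any such string expands to on your own machine with "openssl ciphers -v '<string>'".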

Okay. That's the compatibility list. Basically all clients are supported except for Internet Explorer on XP, but that's end of life anyway. I know many people still use it, but that was our choice basically, and Java 6. Everything else is supported, and it gives you a great rating on ssllabs.com.

We have a chapter on choosing your own cipher suites, so in case you don't trust our settings, that's all right: we explain how we came to our settings, and you might want to come to your own conclusions and maybe follow some of this advice or not, but at least we documented why we came to our conclusions.

So, there is a chapter on practical settings, where all these cipher strings are applied to different servers: Apache, NGINX, Internet Information Server, mail servers, database servers, VPNs, etc. If you are a vendor and you have a product which you think should be in that document, just come up to me afterwards or contact us, and we'll find the right settings according to the document, and the right config snippet that people can use and copy and paste. That's the goal of this chapter: lots of copy-and-paste settings.

What we are still missing is something about MS Exchange Server, SIP and RDP. We want to transform everything from PDF into HTML. And we need a config generator: let's say I have Apache and I want to support this and that client; give me the code snippet for the SSL settings.

Okay, this is how it looks. That's what we offer in the document. Here are the settings; you can copy and paste them if you don't care about the details. If you care, you get all the lines of argumentation in the paper, and please feel free to change the settings yourself. Here are some remarks on why we disabled certain things, !RC4 for example, and here is a fallback solution for SHA-1 if it's needed.

Okay. So, how do you test these settings now? That's the interesting thing. There are a couple of really nice tools out there for testing your Apache, your web service, your mail service. The first one is of course the openssl command, the s_client command. The second tool is ssllabs.com, which is a great tool which I would like to briefly show, if possible. So... sorry, I had to try this here now. You can try any web page, and you will actually get a score here, and people just love yellow stars or grades or whatever. This is a great example of gamification: if you give that URL to your sysadmins, in case they didn't know it yet, they are going to play around with the settings and the document until they get an A here. That's just really the way to do it; that's really nice. Our settings basically produce a nice A. And this page by ssllabs.com is a great page; it also has lots of recommendations on what you could further improve. For example, this one could improve by adding forward secrecy. Lots of information here. A very, very practical page.
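For the openssl part, two hypothetical command lines illustrate the idea (the hostname is a placeholder): verify that SSLv3 is now refused, and inspect which cipher a TLS 1.2 connection negotiates:

openssl s_client -connect www.example.org:443 -ssl3      # should fail with hardened settings
openssl s_client -connect www.example.org:443 -tls1_2    # check the Protocol and Cipher lines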

The same exists for Jabber and for other servers, of course. There are tools called SSLScan and SSLyze. These are all great tools. That's a nice trick: gamification works. Give them to your sysadmins, let them play around with them, and you'll end up with really nice settings.

That's an example of the SSLScan output. So... the whole document is still a work in progress. We have a solid basis with two variants. We presented the public draft at the CCC congress. We still know some weak points in the document. I would love to hear feedback and especially reviews, and if we made mistakes, please tell us, because that's the core of the document, the reviews. And I know we still have to convert it to HTML.

How to participate: as I said, we have our own Git repository and it's mirrored on GitHub. The good way to add a section is to just fork the GitHub version and send us a pull request, and be ready that a couple of people are really going to review it, because that's the rule of the game here: review. The Git repo is world-readable. Everything is world-readable. Everything is under a Creative Commons licence; I think we didn't even add a non-commercial clause, so it's okay to actually include it in products if you want to. We have a mailing list. Review us, review us. That's basically it from my side.

Are there any questions?

AUDIENCE SPEAKER: Richard Barnes. First of all, this is a fantastic project, I am really glad you are doing this; I think it is something that the Internet will benefit from. A couple of thoughts on other things that might be interesting to look at. Format-wise, I think GitHub Markdown could be a good transition format, because it's super simple to write and it translates easily; I think that could be a cool thing to look at. One of the protocols I noticed missing from your list of things you have configs for was XMPP (you had XMPP a little later on), but there is a community that has been really active in trying to improve security configs and standardise the security constraints across the whole XMPP network. If you need contacts, I'm glad to put you in contact, because I think they'd be eager to help.

In terms of technologies to recommend, one of the things you might look at is key pinning in HTTP. I really like that distinction you made between internal stuff and public-facing stuff. Key pinning has had some trouble deploying on the open web, because there is a risk you can break yourself that way if you pin to something that goes bad. It might be really useful for internal stuff, to really lock down and make sure that you have the right stuff for your internal servers.

LEON AARON KAPLAN: A good recommendation. When we wrote the XMPP section in the document, I think ejabberd was still being patched a lot, and I should review that actually and see where we ended up. I think at that stage of writing the document we still had to add some patches to the source code, and we didn't want to document that at that moment, I think, but it needs review, yeah.

CHAIR: Any further questions?

AUDIENCE SPEAKER: Matt. Is there a separate configuration for Windows XP and Internet Explorer? Because they cannot use name-based SSL, I cannot see any way that we can drop these clients.

LEON AARON KAPLAN: Yes, in our document we mention that the way to support XP and IE is by adding RC4. You just add that to your cipher string and you're good. You can play around with ssllabs.com and let it test it for you, but basically just add RC4, or actually remove the exclamation mark of !RC4 in our string, and there you go. I understand that there are settings where you absolutely need to support Internet Explorer, even though it's end of life, and in many developing countries, or, I can't say developing country, but in China and India you have a huge XP user base; I do understand that. It's just that when we talked about this in the group of authors, we said: well, this is the Better Crypto document, so we shouldn't recommend something where we know there are problems. But I understand you may need to support it, and it's easy to do.

AUDIENCE SPEAKER: Matt from the RIPE NCC. I have a comment from Frederick; he is not speaking for any organisation. He says: RSA greater than 3248 bits from keylength.com seems like an over-interpretation; it's not saying to use the same key for 25 years.

LEON AARON KAPLAN: Yeah... I agree, basically. Just remember that the authors of this paper are really a mix of people: sysadmins as well as computer scientists as well as cryptographers. What we strived to do was collect the existing knowledge, so we reference a lot of existing papers; in that case it was ECRYPT-II, and that doesn't automatically mean that I personally agree completely with what it implies. I think we should change keys regularly. However, having written an internal document about PGP key transition inside my organisation, I have to tell you honestly that key transitions are a big mess; it's really, really tricky. So if you say, okay, we need to exchange keys regularly, I agree. However, you then need to take the next step and document how you are going to do that, and how you are going to keep all the signature chains if you have some kind of PKI, etc. It's not so easy, especially in a web-of-trust setting, especially with PGP. So, yeah, the whole topic is unfortunately not easy.

CHAIR: Thank you very much, Leon. I don't think we have got any more questions.

(Applause)

Just a reminder to please rate the presentations; there is a facility to rate the presentations, and it's really useful information for us in the Programme Committee. Next up is Job Snijders; he is going to talk about some work he's been carrying out on selective blackholing for DDoS protection.

JOB SNIJDERS: Thank you. Good morning, everybody. My name is Job Snijders, I do stuff for the Internet, and today I'd like to talk to you about selective blackholing. First I'd like to run a small inventory. Who here offers IP transit to their customers? Please raise your hands. And who offers a blackhole community, so customers can instruct you to discard traffic? Today I want to talk about that concept, but extend it and make it more usable, and I'll explain how it works, how you as an end user of an IP transit provider can use it, and how an IP transit provider could implement it.

Selective blackholing means that you instruct your Internet service provider to only discard packets coming towards your IP addresses under certain conditions. Another interpretation of selective blackholing is that it's a gravity well that light cannot escape, except the colour purple.

Selective blackholing matters because, most often in my observation, content providers are the victims of DDoS attacks. Game servers are attacked, or flight machinery, web shops. It's rarely the eyeballs themselves. And what I have noticed is that most content has a locality property: it's most significant the closer it is to the eyeball, rather than further away. In other words, a Polish web shop owner cares most about Polish eyeballs and less about American, Dutch or Belgian eyeballs, and we can use this property to protect such web shops in a very, very cheap way.

The issue with classic blackholing, as all of you are offering it these days, is that it's an all-or-nothing proposition. The moment you insert that blackhole, any and all traffic is gone, including the revenue that that IP address maybe was generating; if the web shop is offline, nobody can click and spend money on it. So you throw away the baby with the bath water.

Scope, therefore, is relevant. If you could limit in which region traffic is blackholed and in which region traffic works as normal, you could view it as having a small community. Inside the community people can still sell each other bread or drink a beer, but outside that small community the Romans are, and they are maybe congesting other networks upstream. There is life while you are under a DDoS, if you know where the DDoS attack is coming from and where your visitors are located. If there is a difference between the sources of the DDoS and the location of your eyeballs, then you are in a luxurious position to do something about it.

Please note this is damage control, not mitigation; there is no magic in that regard. Assertion 1 is that it is better to receive a small percentage of the DDoS attack than all of it. Maybe I have a 10G uplink; I can take a few gigabits of inbound traffic, but not 30, 50, 60 gigabits.

Assertion 2 is that it's better to remain partially reachable than to have no reachability at all. And that goes back to what I showed with classic blackholing, where it's an all-or-nothing proposition. This is a little bit of both propositions.

What does selective blackholing mean in practice? I have implemented selective blackholing in one of the global IP transit providers on this planet and, in Amsterdam, I connected myself as a customer to this network. I instructed the service provider's network to discard packets coming from further than one thousand kilometres away on their backbone. Then I used the beautiful RIPE Atlas system to test reachability, and what we observe here is that, within the thousand-kilometre radius, the Atlas probes could reach the target, indicating latency, but the American probes, more than 1,000 kilometres away, could not reach the IP destination.

Please realise that the thousand kilometres that this community symbolises is 1,000 kilometres from one point on the sphere to another, based on a haversine calculation, so it doesn't take into account actual fibre paths or the actual path of the data packets; it's just a rough measure, so to speak.
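For reference, the haversine great-circle distance he refers to is easy to compute; a minimal sketch (the coordinates are illustrative):

from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; ignores actual fibre paths."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))  # 6371 km: mean earth radius

# Amsterdam to New York: clearly outside a 1,000 km scope.
print(round(haversine_km(52.37, 4.90, 40.71, -74.01)))  # roughly 5,860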

Another feature you could implement is "blackhole outside this country", and in this case I instruct the service provider to discard packets that it receives on routers outside the Netherlands. What we see here is that the scope has become even narrower. It's very focused around that small, beautiful little country with tall people. However, why on earth are we seeing some Thai and African networks being able to reach that network? These red dots are what remote peering does for the Internet: it completely obfuscates any routing information and hides actual distances. So stop doing remote peering. It's bad.

Anyway... if we zoom in on the Netherlands, we see that within the Netherlands there was perfect reachability: 100% of the RIPE Atlas probes could reach the destination IP. So I have to conclude that with selective blackholing, when I instruct that particular service provider to discard packets outside the Netherlands, I almost have a guarantee that eyeballs inside the Netherlands can still reach me, plus a few people who connect from far away.

And this, as you might expect, if you have a DDoS attack that is perfectly distributed across the globe, will significantly reduce the amount of traffic you still receive while the DDoS is going on.

Let's talk about how you could implement this as a service provider. I encourage all service providers here to implement this, and if you are a customer, request it from your service provider, because this methodology can save you a lot of money.

We'll talk about four features. Feature one is: discard packets coming from further than 1,000 kilometres away on the backbone. Then: further than 2,500 kilometres, outside this country, and outside this continent. There are many variants you can imagine, but the gist should be clearer after this presentation.

First we assign some integers to devices. On the far left side I have seven edge routers; they are spread over three continents. I have assigned each router a unique, say, metro or city ID. I have put the GPS coordinates in the database. I have assigned Asia the integer 3, Europe is 2, America is 1, and each of these devices is located in a country, for which I use the ISO standard country code, because it's a convenient number. These integers are important because we can recycle them in our routing policy.
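A sketch of what such a device database could look like, feeding the Python script mentioned later in the talk (the router names and city IDs are invented; the country codes are ISO 3166-1 numeric):

# city: unique metro ID; country: ISO 3166-1 numeric; continent: 1 = America,
# 2 = Europe, 3 = Asia; lat/lon feed the kilometre-based scope calculations.
DEVICES = {
    "nyc-edge1": {"city": 1, "country": 840, "continent": 1, "lat": 40.71, "lon": -74.01},
    "lon-edge1": {"city": 2, "country": 826, "continent": 2, "lat": 51.51, "lon": -0.13},
    "ams-edge1": {"city": 3, "country": 528, "continent": 2, "lat": 52.37, "lon": 4.90},
    "hkg-edge1": {"city": 4, "country": 344, "continent": 3, "lat": 22.32, "lon": 114.17},
}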

On each device, for this advanced feature, we'll have a little bit of device-specific configuration. To implement this routing policy you'll have quite a bunch of text if you are on IOS or Brocade; if you are on Juniper, it's a very small configuration. But each configuration will need something that identifies the device. So, for the device in New York, we assigned the special city code plus 10,000. We assigned the country ID as the ISO code, and 1,000 times the continent integer, instead of just 1, to identify the continent. London, being in Europe, has 2,000 and the UK country code. You can review this later.

And before we dive into the actual route maps, I want to show you the process flow. On the far right side there is a customer, myself, and what I do is announce to the PE of my service provider a prefix with a certain BGP community that the service provider told me about beforehand. When the service provider receives this BGP update, it goes into the customer-facing route map, and there it will be rewritten to something that is meaningful within the administrative domain of the service provider. In this case, purely as an example, the scope 2000 is added, which means Europe. The instruction is: discard packets outside this continent, and the routing policy is implemented by adding a certain BGP community that the rest of the network can then act upon. The update is propagated with that extra BGP community to another PE. The other PE can verify, because of that metadata, whether it's within the scope or outside the scope. If it's in the scope, it should not blackhole; it should just forward traffic. But as the update propagates to the United States, which is outside this continent, the New York device tries to match it against its policy and realises: I'm outside the scope, I will blackhole this stuff.

How did the New York device know to blackhole? If we look at an iBGP route map, I can show you how that would work. First of all, we null-route a certain magic IP address, so we can use that as a next hop to get rid of packets. The red instance of the route map is a classic blackhole instance: if it matches the community :666, the next hop is set to the bit bucket, and the route map stops parsing there, because if this term matches, the permit applies and there is no further parsing, as there is no continue statement. The second instance, permit 200, matches the selective blackhole communities that were communicated to the customer. Each customer should be able to expect that it can use the same communities in every country. You don't want to give the customer a community that will discard in Germany, another that will discard in Poland, another that will discard in Switzerland; you give them a single BGP community which says it will discard outside this country or outside this scope. This way customers only have to remember four or five BGP communities instead of complex schemes. So in term 200 we match whether a selective blackholing community is attached to the update, and if there is something attached, we continue to entry 1100, which I'll show you in a few minutes. But if there is no match, the third term applies, which is permit 1000, and given that there is no match statement at all, it will accept any prefix and it will be business as usual, as it probably is today in your network.

The continue 1100 jumps to this place, and if there is a match for this metro or this country or this continent (because, remember, we previously assigned per-device specific configuration), the prefix is just accepted. There are no set commands, there are no further instructions; it stops parsing. But if there is no match, if the update apparently falls outside the scope, term 1101 is parsed, because there is a logical order between these two route map statements, and the prefix is null-routed.
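Putting the pieces together, an IOS-style sketch of such an iBGP inbound route map could look like this; all community values, names and the magic next-hop address are invented for illustration, and only the three MY-SCOPES lines differ per device:

! Magic next-hop, statically routed to the bit bucket.
ip route 192.0.2.1 255.255.255.255 Null0

ip community-list standard CLASSIC-BLACKHOLE permit 64511:666
ip community-list standard SELECTIVE-BLACKHOLE permit 64511:660
ip community-list standard SELECTIVE-BLACKHOLE permit 64511:661
! Device-specific part, here for the New York router:
ip community-list standard MY-SCOPES permit 64511:10001   ! my city (1 + 10,000)
ip community-list standard MY-SCOPES permit 64511:840     ! my country (US)
ip community-list standard MY-SCOPES permit 64511:1000    ! my continent (America)

route-map IBGP-IN permit 100
 match community CLASSIC-BLACKHOLE
 set ip next-hop 192.0.2.1
route-map IBGP-IN permit 200
 match community SELECTIVE-BLACKHOLE
 continue 1100
route-map IBGP-IN permit 1000
 ! no match statement: business as usual
route-map IBGP-IN permit 1100
 match community MY-SCOPES
 ! inside the scope: accept and forward normally
route-map IBGP-IN permit 1101
 ! outside the scope: blackhole
 set ip next-hop 192.0.2.1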

To accommodate this iBGP routing policy, we need to rewrite any updates that come in from the customer, and this slide and the next slide will show you what an example customer-facing route map could look like. I have chosen the IOS or Brocade implementation because it's harder; on Junos this would be a few lines and we would have nothing to discuss. But in this case, you'll see the jumping back and forth in the route map.

In the red term, we implement classic blackholing, the all-or-nothing proposition; everybody expects that, so we should deliver it to the customer. We match whether the customer is allowed to send the prefix. If it contains :666, the next hop is set to the bit bucket and we stop parsing the route map, because there was a match.

If there is no match on the red term, the blue term could be applicable. We verify: is customer A allowed to send these prefixes to us? We match whether one of the "outside 1,000 kilometres", "outside 2,500 kilometres", "outside this country" communities matches. If yes, we jump to the next slide; if no, the green term is parsed, and this one is just regular BGP without any relevant features. We verify that the customer is allowed to send the prefix; if yes, you assign some local preference, and that is the end of the parsing.

Line 30, the deny 500 statement, is interesting. That's our catch-all. If a prefix is announced to us that doesn't match any of these classes, the prefix will be dropped. Anything that arrives at this route map statement is probably an illegal prefix that the customer is not allowed to announce, or maybe some combination of features that is not possible in the network.

If we focus on continue 600, we jump to the next slide. What we do here, and this is still the customer-facing route map, is match what the customer wanted to accomplish, maybe "discard outside this country", and then we add a specific community that identifies the scope of that country. In this case, 840 is either the Netherlands or the United States; I confuse those two. Anyway, this is computer-generated stuff; you don't want to type it in as a human being, because those rewrite statements become complex.
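An IOS-style sketch of this customer-facing rewrite, using the term numbers from the talk (the prefix-list names and community values are invented; in a real deployment a generated term like 600 would exist per scope feature):

route-map CUST-A-IN permit 100
 match ip address prefix-list CUST-A-PREFIXES
 match community CUST-BLACKHOLE               ! classic :666, all or nothing
 set ip next-hop 192.0.2.1
route-map CUST-A-IN permit 200
 match ip address prefix-list CUST-A-PREFIXES
 match community CUST-SELECTIVE               ! any of the selective variants
 continue 600
route-map CUST-A-IN permit 300
 match ip address prefix-list CUST-A-PREFIXES
 set local-preference 200                     ! plain BGP, business as usual
route-map CUST-A-IN deny 500
 ! catch-all: not a legal announcement for this customer
route-map CUST-A-IN permit 600
 match community OUTSIDE-THIS-COUNTRY
 set community 64511:660 64511:528 additive   ! feature + scope (528 = NL)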

So, just to remind us what we are doing: on the far side, there is a customer. He instructs with a single BGP community what it is he wants to accomplish. The customer-facing route map that I just showed will rewrite that particular community to something that indicates the feature, plus a scope. And then, as the update propagates through the network, every inbound iBGP route map verifies: am I inside the scope or outside the scope? If I'm inside, I'll do nothing; if I'm outside, I'll blackhole the packets.

But how do you generate those iBGP community strings that we saw in the customer-facing route map? Because this is not something you want to write down with pen and paper. For that, you can use a simple Python script that will generate all of this for you. If you download this script, and it's public domain, you'll notice that it has a small CMDB, much like the table I showed a few slides back, that contains the integers per device: what country, what GPS coordinates, what continent. And this script can generate the communities that are needed, the communities to which you would rewrite the customer updates.

In the network that I demonstrated, with the seven routers spread over three continents, if you want to accomplish a scope of 1,000 kilometres around the London router, the solution would be the London city ID plus the other routers within that radius... I guess that's the Netherlands then; I should have written this out. The point is: if you visualise all the BGP communities as a table, the software searches how to group routers according to the constraints that might be applicable, such as a thousand kilometres or 2,500 kilometres. The scopes like "this country" and "this continent" are very easy, because you can just statically assign whatever integer was assigned to that country. For the BGP feature where you limit the scope in terms of kilometres on the backbone, that's where you need to figure out which devices fall within the radius of 1,000 kilometres and rewrite towards those.

What I recommend, if you implement this, is that on an offline machine you calculate what the route maps should be. You then verify, with whatever toolset you use, whether the offline-calculated route maps are the same as on the device, and if they are not, you push an update to the device. Every time you add a new POP, the scope calculation will be slightly different, so you have to run this once in a while. At the end of this process, you have happy customers that suddenly are no longer afraid of being DDoSed, because they can just shut out traffic that they are not interested in.

Let's talk about some further considerations. You have to automate this if you want to do the 1,000- and 2,500-kilometre thing. I think the geographic, kilometre-based distance feature is interesting, because not all countries in the world are of equal size; some are bigger than others. But for content, maybe it is relevant.

You have to use a central management database or source of truth where you store the integers that are applicable to devices. The nice thing about selective blackholing is that it actually requires very little device-specific configuration compared to the complexity of the routing policy. As I mentioned earlier, the iBGP route map can be cookie-stamped across your devices, and the only thing that differs is the three lines that match the named community list. You can deploy this on any vendor. I have deployed it on a very... let's not talk about my traumatising past. The point is, you can do this on old equipment, new equipment, it doesn't matter; it's something you could even run on your Texas Instruments calculator. This feature doesn't cost you any extra money. It requires a little bit of scripting time, and that should be very easy to pitch within your organisation, because your management probably likes spending no money but getting more. And please note that customers probably have not asked you about this feature because they don't know it exists. Go out there, educate your customers, show them there is a different methodology than the global blackhole: you can do something better, you can limit the scope of the damage.

And last but not least, this method saves both you as a service provider and the customer money. It saves the service provider money because it does not have to transport bits all across the world to be finally dropped into some congested pipe. It doesn't make sense to transport 50 gigabits towards a customer that is connected with only one gigabit. With this methodology, you can still bill the customer for the traffic that happens within the scope that was whitelisted, and you don't even have to transport the traffic that the customer is not interested in over your backbone. So I think it's a win-win proposition.

We have I think two minutes left. It's time for questions.

CHAIR: We have slightly more than two minutes, thankfully. But are there any questions?

AUDIENCE SPEAKER: Hi. So, what's the downside, other than that we have just built a national Internet? It seems like you're creating, by default, DDoS connectivity problems that will be very hard to debug for people other than you. So, do you have an estimate of the cost that you have externalised?

JOB SNIJDERS: I recommend that people only use this when under duress. So if gigabits are coming into your pipe, then you'll have a script that automatically announces the prefix with this particular scope.

AUDIENCE SPEAKER: That wasn't clear, because you said your customers will be unafraid of DDoS because you have it on.

JOB SNIJDERS: They have this as a tool in their tool box to deal with issues. So, I would highly recommend against doing this on a permanent basis unless you are crazy.

AUDIENCE SPEAKER: Nice presentation. Definitely an interesting topic. Actually, we have a Dutch network and we use different Internet providers for our international traffic. So for us, all our routers are actually within a scope of 250 kilometres. How do you recommend we educate our transit providers to actually start implementing this? Because I definitely see a benefit in this.

JOB SNIJDERS: Thank you for bringing this up. It makes most sense to implement a scheme like this if your backbone stretches further than 1,000 kilometres. If it does not, just talk to your upstreams and get them to do it. The nice thing is, you only need one upstream to support this, because, when under duress, you can just announce a more specific /24 to that particular upstream that supports it, and within that /24 you announce the IP address that is under attack and should have the selective blackhole mechanism applied. As with any RFP, you go to your upstream, you point them to this presentation, you tell them: guys, I will only accept this contract if you implement this. But maybe pitch that it saves them money as well.

AUDIENCE SPEAKER: Definitely interested.

AUDIENCE SPEAKER: Hello. I have had for a few years a very similar idea, but with a little change: to make a route server operated by an authority we can trust, for example RIPE. Where I need to deliver the blackhole to routers in America or so, they understand to blackhole via the route server operated by the authority, and this route server will be open to BGP sessions from participating networks, so I will be able to send the blackhole directly to a remote network that is not under my control.

JOB SNIJDERS: I have heard of multiple initiatives to do some form of cooperative blackholing between service providers, but before we dive into policies such as this, I think the first issue that should be solved is how you authorise whether somebody is allowed to send that blackhole. And that has not been solved yet. IRR data is not good enough, and the RIPE region is the only one that has a feedback loop. But I'd be happy to talk about this offline after the presentation.

AUDIENCE SPEAKER: Rudiger Volk. Well, okay, two remarks. One is, it would be nice to actually have at least a short remark that blackholing should always be done with very serious authentication, which is also a comment on the previous question.

JOB SNIJDERS: We agree.

RUDIGER VOLK: The other thing is, following the remarks from Zurich: in a global operator, you actually have to expect that the help desk dealing with requests of "well, okay, we feel some of our traffic is being blackholed" will have a much harder job if blackholing policies injected by third parties are differentiated across all the places, and there will be blackholing applied to traffic paths in ways that are counter-intuitive. So while this is a neat idea for defining extended signalling and doing blackholing, well, okay, the service provider side is actually not just nice. There are problems with it.

JOB SNIJDERS: But will you implement it, Rudiger?

RUDIGER VOLK: Kind of. I guess the cycles of my SDN implementation, which has been around for many years now, are actually pretty long, and I'm not quite sure into which cycle it goes.

AUDIENCE SPEAKER: Ronan Mullally from Akamai. To link this back to Geoff Huston's talk yesterday: how do you see this working in an environment where everybody comes to the big peering points to peer as close as they can, rather than having a more geographic distribution?

JOB SNIJDERS: This methodology works by the grace of the fact that most networks today apply hot-potato routing. You want to get rid of packets as soon as you can, and drop them, as soon as you can, onto a network that is closer to the end point. And it's precisely because everyone wants to connect with each other, and we have that dense Internet, that this works. A service provider that has a customer sending the DDoS will try to hand off the packets inside the country where it receives them. And if the end user is in a different country, then this methodology applies, because in both countries there is an edge router: one edge router is instructed to drop packets, the other edge router will not drop them. Precisely because the Internet is dense and interconnected, and people peer with each other, this works. It would not work if you only peered with each other on one continent but not on the other continent, because then the path stretch would circumvent the selective blackholing. So keep peering. Peer with each other. Just don't do it remotely.

AUDIENCE SPEAKER: Maybe I have got the wrong end of the stick. Say, for example, I peer with, to pick a random network, China Telecom in London, and I have a community which says only announce this prefix not blackholed in London. Do I need to go and change all my route policies to say, well, this is China Telecom, so this isn't actually in scope for this, or...

JOB SNIJDERS: But, as Akamai, you would implement this, is that...?

AUDIENCE SPEAKER: This is... Akamai would require... this is something I have done on a more manual basis, but if...

CHAIR: We're quite tight on time. So if you want to have a conversation, we have time for one more question, but that's it.

AUDIENCE SPEAKER: Hello, Andreas Polyrakis from GRNET. So, there is another modern technique for DDoS mitigation, where you basically carry flow filters over BGP: FlowSpec. I don't know if you have experience with FlowSpec, but I wonder if you can comment on whether these techniques can be applied to FlowSpec as well, because you can use BGP communities there as well.

JOB SNIJDERS: I think this applies to any type of BGP-based signalling, because you just set a scope for a certain set of edge routers and apply a match or deny. So this could be used with FlowSpec. But what FlowSpec might have as an issue today is that there is no strong guarantee that the customer is actually allowed to divert traffic from VRF 1 to VRF 2. So the authentication layer, as of today, is something I'm a little bit worried about, while with this mechanism you can leverage whatever it is you do today anyway for classic blackholing. So I think the barrier is lower. But with FlowSpec you would be able to very, very selectively either allow or deny traffic based on layer 4 parameters, so... maybe next year.

CHAIR: Sorry, and...

AUDIENCE SPEAKER: Even today, there is no issue doing FlowSpec with the same scopes. Authentication is the same problem, right? I mean, the authentication of the blackhole instruction, that somebody who gives this instruction is authorised to do that because it's my network, it's my server, that's a challenge, but it's a challenge for classic blackholing and FlowSpec alike.

CHAIR: Sorry, we really need to finish this at this point. So, very quickly, if you have a response, Job?

JOB SNIJDERS: You can do VRF jumping with FlowSpec, and that's outside the scope of IRR-based authenticated data. I guess this is a separate discussion. Thank you for your time. If you have any questions, approach me, e-mail me, I'll be happy to help you implement this. I have Junos configuration as well. I hope you enjoyed yourselves.

(Applause)

CHAIR: So the last talk of the session this morning is Christian Rossow.

CHRISTIAN ROSSOW: My name is Christian Rossow, and I am an academic researcher, mainly at the University of Amsterdam. You may have heard my name because I published a research report about half a year ago that basically summarises protocols that you can use for amplification attacks, and this is exactly what this talk is about.

Just a brief reminder: what is an amplification attack? It's basically a DDoS attack in which the attacker on the left-hand side does not directly attack the victim on the right-hand side, but instead abuses so-called amplifiers, which are typically public servers of certain protocols, for example DNS, that you can send requests to and that will reflect the traffic to the victim. The attacker uses IP spoofing to set the victim's address as the source of the requests. In the most well-known case, attackers abuse so-called open DNS resolvers, and these resolvers will then respond to the victim with a much larger packet than the request. So, in essence, the attack mainly abuses the fact that the amplifiers not only reflect the traffic but also amplify it.

There were a couple of huge amplification-based attacks in the past. One was against Spamhaus, which was on the scale of 300 gigabits per second and was abusing the DNS protocol, and basically this attack motivated me to dive into the general threat of amplification attacks: in principle, what other protocols than DNS can actually be abused for such kinds of attacks?

After the research, there was another attack which was much larger in scale: a 400-gigabit attack against a French hosting provider, which was abusing another protocol for amplification, in this case NTP, from my perspective the worst protocol we have ever seen in this context.

In this talk I will cover, basically, two sides. The first part of the talk will be about the attacker's point of view: from the point of view of the attacker, what protocols can be abused, and what protocols are really effective and efficient in terms of amplification? The second part of the talk will be the defensive side: what can we, as a community, do to really counter the threat of amplification attacks?

The first thing an attacker needs to do is really enumerate the protocols that he can abuse for amplification. In my work I found 14 protocols that are vulnerable to amplification attacks, and this is not an exhaustive list, I have to emphasise this: there are more protocols than these 14, but these are the 14 that I think are, from the viewpoint of the attacker, very valuable. I grouped these 14 protocols into five categories which basically summarise what they do. The first one is network services, and I am sure all of you know these protocols.

One is DNS, as I just mentioned, which gives amplification with huge factors. Then we have SNMP. Then we have NTP, which is by far the worst protocol ever seen in this context, and we have NetBIOS and SSDP. Next we have two legacy protocols, but before I give the names I would like to do a little quiz; I was arguably too young to do this research, because these protocols were invented before I was born. Who of you knows the protocols QOTD or CharGen? For those of you who do not know these two protocols, you should read their RFCs, which are basically one-liners. To give an example, the CharGen RFC says: you listen on port 19, and you respond to any UDP packet with a packet full of random characters. From an amplification perspective this is ideal: you send a UDP packet with one byte and you get 1,000 random bytes back. This is cool for amplification.

The third group that we looked at were P2P protocols, like BitTorrent and Kad, and these also have vulnerabilities that can be abused for amplification attacks, basically abusing the fact that these peers share lists of the files, or the file chunks, that they can offer.

The fourth group of protocols, Quake 3 and Steam, are basically game server engines that you can also abuse for amplification attacks. In these protocols you send a request for server information, and the server sends you back a huge blob of text and the list of current players, which again gives you amplification. And lastly, I found three peer-to-peer botnets which can also be abused for amplification attacks: you do not own or control these botnets, but you can abuse their peers as normal amplification servers, just like you would DNS-based servers.

As a matter of fact, most of these protocols are quite old. If you look at the times they were invented, there was certainly no idea yet of DDoS attacks, let alone amplification attacks. So these protocols were largely designed without any security perspective with regard to amplification attacks, which is why I am not surprised that there are so many vulnerabilities out there in these protocols.

The second thing you are interested in as an attacker is which of these 14 protocols is really effective for my attack, and the first measure here is: what is the actual factor of amplification when I abuse a certain protocol? I measured this in two ways. One is the bandwidth amplification factor, and this is not rocket science: you divide the number of bytes the victim receives by the number of bytes the attacker sends. For example, if you send a DNS request of 50 bytes and it triggers a response like 1,000 bytes large, this is a factor of 20; that's the basic computation. The other factor that I looked at is the packet amplification factor. Sometimes you are not really interested in having the largest bandwidth-based attack, but you would like to amplify the number of packets. Again, you divide the number of packets seen at the victim by the number of packets you need to send to trigger this traffic, and you get the packet amplification factor. To give you another example: in NTP you can abuse the monitoring list feature, where you send one single request packet to a server and the server will send you 100 packets back; this gives you a packet amplification factor of 100. To put this into numbers, I made this small graph which basically illustrates the bandwidth amplification factors; on the Y axis you see the different protocols. Let me highlight a couple of examples. If you look at this graph, one thing stands out, which is the NTP amplification. In the experiments that I did, NTP offered amplification of over 4,500 and, in theory, it's even higher, like a factor of 5,500, which is a really large amplification, and there we abuse really one specific monitoring feature of NTP which doesn't need to be there, but it is enabled on many NTP servers, and that is why you can abuse it this way. The second protocols that I found interesting are the P2P ones; with BitTorrent you can see a ten-fold amplification. And game servers like Steam give you a fifteen-fold amplification, which is already quite effective, and the other protocols can be seen in the graph as well.
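The two measures are simple ratios; here is a minimal sketch of the computation with the DNS and NTP examples from the talk:

def baf(bytes_at_victim, bytes_sent):
    """Bandwidth amplification factor: payload bytes arriving at the
    victim divided by payload bytes the attacker sends."""
    return bytes_at_victim / bytes_sent

def paf(packets_at_victim, packets_sent):
    """Packet amplification factor: same idea with packet counts."""
    return packets_at_victim / packets_sent

print(baf(1000, 50))  # the 50-byte DNS query with a 1,000-byte answer: 20.0
print(paf(100, 1))    # one NTP monitoring-list request triggering 100 packets: 100.0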

You can see that there are slight variations in the colour of the bars, and this indicates the choice the attacker makes about which amplification servers he abuses. If, for example, for NTP you just choose servers at random, you will only get a couple-of-hundred-fold amplification. If you choose the servers that send the largest responses back, you get this 4,500 amplification factor; that is what the different colours in the bar chart show. This is an important choice for the attacker: not to abuse all of the servers you can get, but to really focus on the servers that give you the largest amplification effect.

Just to give you a random example: if you look at the game servers, some servers have larger responses because more players are currently playing, so the list of players is longer and the server information is also longer; in that case the response is much larger than the response from an empty server.

Another important factor for the attacker is not only the amplification factor, but also how many amplification servers he can abuse, because even if you have an amplification factor of 5,000, if there is only one server, that doesn't help you. So another part of my research was that I tried to enumerate all the servers out there for these different protocols. Basically I used three different techniques. One was scanning: I wrote a very fast Internet-wide scanner which scans the whole IPv4 space in a couple of hours, and this gives a quick view on, for example, all the NTP servers out there. We found one-and-a-half million NTP servers which have this particular feature and can be abused with this amplification factor. Another technique that I used is crawling. For P2P networks like BitTorrent you do not have a single port that you can scan the Internet for, so scanning is not an option, but instead you can use crawling, which basically leverages a graph search: you ask the peers that you already know for their neighbour lists, they share their neighbour lists, and in an iterative process you find all the peers in the graph. For BitTorrent, for example, you can easily find 5 million peers, and you find more and more over time.

Lastly, and very easily, for the game servers you do not really need to scan. What you can do there is just ask the so-called master server for a complete list of servers. As you can imagine, there is a central server, and a normal player would contact this master server and ask: okay, give me a list of servers from which I can choose where to play. This is exactly the mechanism an attacker can abuse: he just requests the whole list of servers and abuses this list for amplification.

For Steam, there are about 170,000 servers that an attacker can abuse this way.

Another important fact for the attacker is that this kind of reconnaissance of amplification servers is not effort-free, so the question remains: how long are these amplifiers really valid? Presumably I obtained a list today; how many of the servers on that list are still available some days later? This graph shows the churn of the systems, basically. It starts on the left-hand side with 0, which is the point where I just obtained a complete list of amplifiers for the protocols, and on the Y axis, the vertical axis, you can see how many servers are still reachable after some time.

To give you one example: if you look at the upper line, these are the NTP servers. After one week on the X axis you can see that 95% of the NTP servers are still available, which, from the attacker's point of view, is ideal: you get the list once and it's still valid the next week. Even after seven weeks you can see that 80% of the amplifiers on the list you obtained seven weeks ago are still valid. From the attacker's point of view this is really a nice feature. Whereas for the other protocols the churn is much higher: after a week you will see that 50% of the list has already dropped out, probably because devices were taken offline or obtained a new IP address.
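
The churn curves can be computed from repeated scans. A minimal sketch, assuming you keep one set of responsive IP addresses per weekly snapshot:

    def availability(initial_scan, later_scan):
        # Fraction of the amplifiers from the first snapshot that still
        # respond in a later snapshot; this is the Y axis of the churn
        # graph. Both arguments are sets of IP addresses.
        return len(initial_scan & later_scan) / len(initial_scan)

    # e.g. availability(ntp_week0, ntp_week1) would be about 0.95
    # according to the numbers in the talk.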

We were then really interested in looking into the reasons for these amplifiers, and we tried to understand what particular systems these really are, which made us write a small fingerprinting tool that tried to grab information like the operating system and the architecture of these systems, to understand what we are actually dealing with. To give you a picture of this, the following table summarises our fingerprinting results. Which operating systems the amplifiers run is quite diverse with regard to the protocol. To give you an example, if you look at NTP you can see that about 50% of the systems run some kind of UNIX or Linux and another 40% run Cisco IOS, which is what many routers from Cisco or ex-Cisco companies run; so these are mostly UNIX-like systems. If you look at NetBIOS, for example, you will see that the vast majority in this case are Windows hosts. This basically shows you that it's really a diverse landscape of amplifiers that we need to deal with.

Looking at the architectures was also really interesting: we found many x86 systems, but we also found MIPS, ARM and all the architectures that you would expect for embedded devices. This shows that we have to deal with a really diverse landscape, and it will take some time to fix this issue.

So, enough about the attacker; let's talk about the defender. What can we do to defend against the whole threat of amplification attacks? In principle, the solution is quite simple. The root cause of amplification attacks is IP address spoofing: if there was not a single provider on earth that allowed IP spoofing, we wouldn't have the problem of amplification attacks. So, in principle, the main goal should be to get rid of the problem of IP address spoofing, which means that everybody should validate the source IP addresses at their network's exit. There are two initiatives which are pretty similar. One is called SAVE, Source Address Verification Everywhere. The other one is BCP 38, which also talks about spoofing in the network.
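
As a very rough sketch of what such egress filtering could look like on a Linux border router (the interface name eth0 and the prefix 192.0.2.0/24 are placeholders for your uplink and your own address space):

    # Drop anything leaving the network whose source address is not one
    # of our own prefixes; spoofed packets then never reach the Internet.
    iptables -A FORWARD -o eth0 ! -s 192.0.2.0/24 -j DROP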

Another thing we can do is fix particular protocols. Again, another root cause of amplification is that these servers normally do not verify the clients: as they use UDP, you just trust the source address of the packet and reply to it. If, instead, these protocols had a proper handshake, one that verifies the client that sent the request, we wouldn't have this issue at all. One solution is to switch to TCP, for example, which offers this beautiful handshake that verifies the source of the packets. That is, of course, not doable for protocols that have existed for 20 years already and did not foresee this from the early stages. But in theory there are also possibilities to implement handshakes over UDP, so this is a suggestion for all the new protocols that we currently invent, like SPDY, although it's probably not adoptable for protocols that have been around for 30 years already.
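
To illustrate how a handshake over UDP could avoid amplification, here is a minimal Python sketch of a stateless cookie exchange; this is purely an illustration of the idea, not any existing protocol, and all names in it are invented:

    import hashlib
    import hmac
    import secrets

    SECRET = secrets.token_bytes(16)  # server-side secret, rotated regularly

    def cookie_for(addr):
        # Stateless cookie bound to the claimed source address.
        return hmac.new(SECRET, addr.encode(), hashlib.sha256).digest()[:8]

    def handle_request(addr, cookie, big_response):
        if cookie != cookie_for(addr):
            # First contact: answer with a tiny cookie instead of the
            # large payload, so a spoofer gains no amplification.
            return cookie_for(addr)
        # The sender proved it can receive traffic at addr, so the
        # source address is verified: send the real (large) response.
        return big_response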

Another counter-measure is rate limiting, and that's fairly successful if the number of amplifiers is rather low. For example, if you look at the Quake 3 game servers, there are only about 1,000 servers, and if you deploy rate limiting on these servers, it is able to limit the overall attack volume that an attacker can achieve. If you deploy rate limiting, on the other hand, on 25 million DNS resolvers, the impact you can achieve is fairly low, because even if every individual resolver does rate limiting, the overall attack volume when all 25 million amplifiers are abused at the same time is still enormous. So you still have a huge attack volume even though you rate limit the individual resolvers.
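
To make the argument concrete, a back-of-the-envelope sketch, assuming a hypothetical per-amplifier cap of 10 kbit/s:

    cap_bps = 10_000  # assumed rate limit per amplifier: 10 kbit/s

    print(1_000 * cap_bps / 1e6)       # ~1,000 game servers: 10 Mbit/s total
    print(25_000_000 * cap_bps / 1e9)  # 25 million resolvers: 250 Gbit/s total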

Another thing that I did in this research is to understand whether systems in your network are being abused as amplifiers. For example, in the following graph you see in the middle the black node denoted with an M; this is the amplifier that is currently being abused by an attacker A, which is at the top right of the graph. A spoofs packets by imitating the IP address of the victim, V, on the right-hand side. So, from the perspective of the amplifier M, you see some traffic coming from V and you simply respond to this traffic. In principle, this is not really suspicious, because you do not see that there is some A spoofing the IP addresses.

What you do see, however, is that there is a very high bandwidth between the two hosts; for example, if M is a DNS server, you wouldn't expect such a high bandwidth between a client and your DNS server. And second, you can leverage the asymmetry of the traffic volume. In this particular example, the attacker sends 3 megabits per second of attack traffic and the server responds with 90 megabits per second, which shows an asymmetry of volumes by a factor of 30; that is fairly suspicious in the context of DNS over such a long run. So this gives you some insight into the amplifiers in your network, and you can basically use your NetFlow data to find amplifiers like this.
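
A minimal Python sketch of this asymmetry check over flow data; the (src, dst, bytes) record format is an assumption about how you aggregate your NetFlow exports:

    from collections import defaultdict

    def find_suspected_amplifiers(flows, ratio=30):
        # flows: iterable of (src, dst, nbytes) records.
        sent = defaultdict(int)
        for src, dst, nbytes in flows:
            sent[(src, dst)] += nbytes
        # Flag host pairs whose response volume dwarfs the request
        # volume, like the 3 Mbit/s -> 90 Mbit/s example above.
        suspects = []
        for (a, b), out_bytes in list(sent.items()):
            in_bytes = sent.get((b, a), 0)
            if in_bytes and out_bytes / in_bytes >= ratio:
                suspects.append((a, b, out_bytes / in_bytes))
        return suspects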

Later on we dive into fixing certain protocols, and actually the advice for fixing these protocols has been around for years already. Take DNS, for example. In principle, the problem of fixing DNS amplification is again very simple: limit all the resolvers to serve clients only from their own network and do not accept any requests from outside of the network. This is fairly trivial to implement in all the DNS resolvers out there, and at this point there are a couple of resources giving suggestions on how exactly you can do this with config templates, so it's basically copy/pasting these config templates to ensure your resolver is not open any more. These recommendations have been out there for years, but we still have a large number of open DNS resolvers and it's actually not going down, unfortunately.
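
As one example of such a config template, restricting recursion in BIND to your own clients could look like this in named.conf; this is a sketch, not the full recommendation, and 192.0.2.0/24 is a placeholder for your own prefix:

    options {
        // Only answer recursive queries from localhost and our own
        // network; everybody else is refused.
        allow-recursion { 127.0.0.1; 192.0.2.0/24; };
    };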

So it's a good step to verify whether your network has some open resolvers. You can go to this website, which is basically also scanning for open resolvers, enter your network prefix or prefixes, and then check whether hosts in your network are open resolvers. It's just entering your prefixes on the website and checking; it doesn't take a lot of time.

On the other hand, for the authoritative name servers, there is not really an option to close them off from the outside. However, there is an initiative called Response Rate Limiting (RRL), which is now also implemented in the BIND DNS server and which basically does rate limiting on the diversity of the responses, so the server doesn't send many responses which contain the same content, which is mostly what attackers abuse during an attack. You can have a look at this website to check the details of the proposal, but if you operate an authoritative name server, I suggest this is something you should definitely look at to get it secured against amplification abuse.
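
For illustration, enabling RRL in a recent BIND is a short block in named.conf; the value 5 is just an example, not a recommendation from the talk:

    options {
        rate-limit {
            // Cap identical responses towards any one client network
            // at five per second; excess responses are dropped or
            // truncated so legitimate clients can retry over TCP.
            responses-per-second 5;
        };
    };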

For NTP, I just briefly mentioned this monlist feature. It is an optional feature which is only there for debugging purposes, and it basically allows you to request the 600 most recent clients that have used an NTP server. This is nothing that you really need in the normal case, so for NTP the fix is as simple as disabling this one particular monlist feature in your configuration, and this is fairly easy: just add a single line to the configuration of ntpd, which is the most prevalent NTP daemon out there, and you get rid of the problems with the monlist attacks. Small amplification vectors will remain, because there are other protocol messages that can be abused in NTP, but you will get rid of the worst attack in this case.
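
For the reference ntpd, that single line in ntp.conf is the following (a minimal sketch; some hardening guides additionally recommend restrict statements):

    # /etc/ntp.conf: turn off the monitoring/monlist feature that is
    # abused for amplification; time synchronisation is unaffected.
    disable monitor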

Again, there is an awesome website, this time openntpproject.org, where you can enter your prefixes and check whether your network actually has some NTP servers that should be shut down or reconfigured like this. The important thing to note is that you will not break the NTP protocol if you configure the NTP servers like this, because, as I said, this feature is optional and it's not required for normal NTP synchronisation.

If you want, you can also do this at the network layer, because these monlist packets are actually very easy to identify. Basically, you just look at any traffic that comes from UDP port 123, and then you look at the packet size: if the IP packet size is exactly 468 bytes, that's an indicator that you can block these packets and get rid of the effect of the amplification attacks on your network. Again, by doing so you wouldn't break normal NTP. I'm not saying it's ideal to do this, but given that NTP has such a huge amplification factor, it might be necessary when somebody is under attack.
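
A sketch of such a filter with Linux iptables; the length match refers to the total IP packet size:

    # Drop NTP traffic from source port 123 whose IP packet length is
    # exactly 468 bytes (the size of a monlist response packet); normal
    # NTP responses are far smaller and pass through.
    iptables -A FORWARD -p udp --sport 123 -m length --length 468 -j DROP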

So let's dive into NTP a bit more. When I discovered the vulnerability last year, I was basically kind of shocked, because I saw it offers such a huge amplification factor, and I wanted to do something good. The first thing I did was monitor the number of NTP servers over time. I started in November last year and kept monitoring the number of NTP servers on a monthly basis. Not only that, after a couple of weeks I tried to reach out to as many people with influence as possible: for example, Mitre created a CVE advisory for the vulnerability, I contacted a couple of CERTs, like some German and Dutch CERTs, and I also prepared a list of IP addresses and sent it to many providers so that they could check whether they have vulnerable systems in their networks. What you can see over time is that the number of amplifiers went down quite significantly. This graph shows you two lines; let's focus on the lower line, because this is the number of NTP amplifiers that are vulnerable to the monlist attack. What you can see is that the number of amplifiers goes down from 1.6 million to about 100,000 in February and keeps steadily decreasing. So this is something that is working, something that the security community has achieved, so I guess the problem of NTP amplification will really be solved rather soon.

There are multiple reasons for this decrease. I think the first reason is that there is really good advice on how you can configure your NTP servers; it's fairly easy to do, and there is also good communication within the community about the IP addresses that are vulnerable to this particular attack. Also, many providers started to rate limit NTP traffic or to block certain monlist packets, as I just mentioned, and this is also causing a large decrease in the numbers.

Then there is also a marginal decrease in the numbers because some people started blocking my scanner, which is actually fairly aggressive: it scans every week for a couple of protocols, so I'm sure some of you will have seen me in the past. But the effect of people blocking my scanner is really marginal, in the range of 5%.

The decrease of NTP amplifiers is also interesting if you look at certain regions where we had problems with NTP, and at the challenges that remain. For example, in the ARIN region, which is mostly North America, the number of amplifiers went down from 1.1 million to 30,000, which is really a huge decrease; only 3% of the NTP amplifiers are still remaining. In other regions, like for example LACNIC, the decrease on a relative scale was much smaller. But still, as you can see from this table, the decrease in numbers is significant over these couple of weeks.

To conclude my talk: we see that many protocols are actually vulnerable to amplification attacks. I have briefly summarised what NTP can do and what DNS can do, but be aware that there are other protocols that can be used for amplification in the future. We can really mitigate individual protocols, as we have shown with NTP, where there was a huge decrease in amplifiers, but we have failed to do so for other protocols like DNS, where the number of open resolvers has been really steady over the last years, in the range of 20 to 25 million, so we cannot really see any effect for DNS there.

Thank you for your attention and I'm open for any questions.

CHAIR: We have time for just about one question, if there is one.

AUDIENCE SPEAKER: Stephane Bortzmeyer. It's not a question, it's a remark about your bandwidth amplification factor. It's a nice idea to give a name to these factors, but you define it by the ratio of UDP payloads. It could have been a ratio of the application payload, or of the entire packet, because from the point of view of the attacker you pay for a complete packet, with the ethernet header and the IP header, so defining the amplification factor using just UDP makes it look worse than it really is.

CHRISTIAN ROSSOW: The reason I did this is that when we switch from IPv4 to IPv6, the amplification factors that I found don't really change. But you are right: if you also include the IP header and the ethernet frame, you would have severely lower amplification factors. To give you one example, the NTP amplification factor would decrease from 5,000 to 1,000, which is really a sharp decrease. But still, the relative ratios remain the same, so it gives you a comparison of which protocol is worse than another.

AUDIENCE SPEAKER: On DNS, because I don't know anything else: the 25 or 26 million open resolvers are actually open proxies, mostly home gateways, and because DNS is such a weird protocol, the actual resolvers that come to the authoritative servers are only something like 250,000, so it's 1%. So it's probably possible to do something with that 1% to kind of deter the attack and still leave the open proxies there, because you can't close them; customers would have to exchange their devices.

CHRISTIAN ROSSOW: That's probably a very hard process. What we do is also give data to organisations like Shadowserver, which we hope pass this data on to the ISPs, which we hope pass the data on to the end consumers, but, yeah, as we see, it's apparently not working that well.

AUDIENCE SPEAKER: [Arin Caplin]. You almost took my line before, especially on the home gateways. We have a big problem there, because they run dnsmasq, and if they have a problem, they blink; if they don't have a problem, they blink. And ask anyone in your circle of friends: who would usually update the firmware on their Wi-Fi router? Almost none of them, except a few people here in this room maybe who play around with open firmware. So this is a huge problem, and I would like to encourage people in this room to actually find ways to detect these dnsmasq-based systems. One thing that worked in a small wireless rooftop-node ISP that I have been building in Vienna is this: the meshed rooftop-node wireless ISP gets its bandwidth over Wi-Fi, so while all the DNS amplification attacks were ongoing, people's Internet speed dropped dramatically. We then informed people how to fix that, and we went from 250 open recursive name servers to 3 in a couple of weeks. So that's another trick to do it.

CHRISTIAN ROSSOW: In response to this: fingerprinting DNS servers is fairly hard. NTP, for example, gives you a whole bunch of information about what system it is running on, but DNS servers do not, and this makes it really hard for us to fingerprint these systems down to the detail of the router vendor or the system being run. Whenever we see a vendor that ships a bogus open DNS resolver, we contact this vendor and try to get feedback from them on why they are doing this. This may change things, but, as you said, nobody is really updating the router firmware, so this will take five to ten years, until everybody has replaced their old router.

AUDIENCE SPEAKER: Two quick comments. About the mitigation: BIND has implemented rate limiting, but you can also rate limit with iptables, so if you are using Linux you can rate limit any UDP-based protocol. The second comment is on blocking 468-byte UDP packets: I think it's not a good idea, because if the attacker adds IP options to the packet, it makes the packet longer, and then I think it won't block the attack.

CHRISTIAN ROSSOW: On the first one, the DNS thing: BIND has rate limiting, but as I said, it's not really effective if you have 25 million resolvers that do individual rate limiting. The problem is that an attacker can stay under the rate limit of a single DNS resolver, but if he abuses 25 million at the same time, you still have a high attack volume. So this is not really working for open resolvers; it does work for authoritative name servers, but not really for the resolvers, unfortunately.

CHAIR: Okay. Thank you very much. I'm afraid we don't really have any more time. So thank you, Christian, for your excellent presentation.

(Applause)

So if you rate the presentations, you can win a prize, and the RIPE NCC have randomly selected a winner: if Monika Kdanova from Casablanca is here... yes, if you go to the registration desk, you have won a prize. You can go and get your coffee now.

(Coffee break)