
Security Concepts

travis+security@subspacefield.org

Abstract

This is an online book about computer, network, technical, physical, information and cryptographic security. It is a labor of love, incomplete until the day I am finished.

1 Metadata

The books that help you most are those which make you think the most. The hardest way of learning is that of easy reading; but a great book that comes from a great thinker is a ship of thought, deep freighted with truth and beauty.
— Theodore Parker

1.1 Copyright and Distribution Control

Kindly link a person to it instead of redistributing it, so that people may always receive the latest version. However, even an outdated copy is better than none. The PDF version is preferred and more likely to render properly (especially graphics and special mathematical characters), but the HTML version is simply too convenient not to have available. The latest version is always here:
This is a copyrighted work, with some rights reserved. This work is licensed under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License. This means you may redistribute it for non-commercial purposes, and that you must attribute me properly (without suggesting I endorse your work). For attribution, please include a prominent link back to this original work and some text describing the changes. I am comfortable with certain derivative works, such as translation into other languages, but not sure about others, so I have not yet explicitly granted permission for all derivative uses. If you have any questions, please email me and I’ll be happy to discuss it with you.

1.2 Goals

I wrote this paper to try and examine the typical problems in computer security and related areas, and attempt to extract from them principles for defending systems. To this end I attempt to synthesize various fields of knowledge, including computer security, network security, cryptology, and intelligence. I also attempt to extract the principles and implicit assumptions behind cryptography and the protection of classified information, as obtained through reverse-engineering (that is, informed speculation based on existing regulations and stuff I read in books), where they are relevant to technological security.

1.3 Audience

When I picture a perfect reader, I always picture a monster of courage and curiosity, also something supple, cunning, cautious, a born adventurer and discoverer.
— Friedrich Nietzsche

This is not intended to be an introductory text, although a beginner could gain something from it. The reason behind this is that beginners think in terms of tactics, rather than strategy, and of details rather than generalities. There are many fine books on computer and network security tactics (and many more not-so-fine books), tactics change quickly, and being unpaid for this work, I am a lazy author. The reason why even a beginner may gain from it is that I have attempted to extract abstract concepts and strategies which are not necessarily tied to computer security. I have also attempted to illustrate the points with interesting and entertaining examples, and would love to have more, so if you can think of an example for one of my points, please send it to me!
I’m writing this for you, noble reader, so your comments are very welcome; you will be helping me make this better for every future reader. If you send a contribution or comment, you’ll save me a lot of work if you tell me whether you wish to be mentioned in the credits (see 39↓) or not; I want to respect the privacy of anonymous contributors. If you’re concerned that would be presumptuous, don’t be; I consider it considerate of you to save me an email exchange. Security bloggers will find plenty of fodder by looking for new URLs added to this page, and I encourage you to do it, since I simply don’t have time to comment on everything I link to. If you link to this paper from your blog entry, all the better.

1.4 About This Work

I have started this book with some terminology as a way to frame the discussion. Then I get into the details of the technology. Since this is adequately explained in other works, these sections are somewhat lean and may merely be a list of links. Then I get into my primary contribution, which is the fundamental principles of security which I have extracted from the technological details. Afterwards, I summarize some common arguments that one sees among security people, and I finish up with some of my personal observations and opinions.

1.5 On the HTML Version

Since this document is constantly being revised, I suggest that you start with the table of contents and click on the subject headings so that you can see which ones you have read already. If I add a section, it will show up as unread. By the time it has expired from your browser’s history, it is probably time to re-read it anyway, since the contents have probably been updated.
See the end of this page for the date it was generated (which is also the last update time). I currently update this about once every two weeks.
Some equations may fail to render in HTML. Thus, you may wish to view the PDF version instead.

1.6 About Writing This

Part of the challenge with writing about this topic is that we are always learning and it never seems to settle down, nor does one ever seem to get a sense of completion. I consider it more permanent and organized than a blog, more up-to-date than a book, and more comprehensive and self-contained than most web pages. I know it’s uneven; in some areas it’s just a heading with a paragraph, or a few links, in other places it can be as smoothly written as a book. I thought about breaking it up into multiple documents, so I could release each with much more fanfare, but that’s just not the way I write, and it makes it difficult to do as much cross-linking as I’d like.
This is to my knowledge the first attempt to publish a computer security book on the web before printing it, so I have no idea if it will even be possible to print it commercially. That’s okay; I’m not writing for money. I’d like for the Internet to be the public library of the 21st century, and this is my first significant donation to the collection. I am reminded of the advice of a staffer in the computer science department, who said, “do what you love, and the money will take care of itself”.
That having been said, if you want to contribute towards the effort, you can help me defray the costs of maintaining a server and such by visiting our donation page. If you would like to donate but cannot, you may wait until such a time as you can afford to, and then give something away (i.e. pay it forward).

1.7 Tools Used To Create This Book

I use LyX, but I’m still a bit of a novice. I have a love/hate relationship with it and the underlying typesetting language LaTeX.

2 Security Properties

What do we mean by secure? When I say secure, I mean that an adversary can’t make the system do something that its owner (or designer, or administrator, or even user) did not intend. Often this involves a violation of a general security property. Some security properties include:
confidentiality refers to whether the information in question is disclosed or remains private.
integrity refers to whether the systems (or data) remain uncorrupted. The opposite of this is malleability, where it is possible to change data without detection, and believe it or not, sometimes this is a desirable security property.
availability is whether the system is available when you need it or not.
consistency is whether the system behaves the same each time you use it.
auditability is whether the system keeps good records of what has happened so it can be investigated later. Direct-recording electronic voting machines (with no paper trail) are unauditable.
control is whether the system obeys only the authorized users or not.
authentication is whether the system can properly identify users. Sometimes, it is desirable that the system cannot do so, in which case it is anonymous or pseudonymous.
non-repudiation is a relatively obscure term meaning that if you take an action, you won’t be able to deny it later. Sometimes, you want the opposite, in which case you want repudiability (“plausible deniability”).
Please forgive the slight difference in the way they are named; while English is partly to blame, these properties are not entirely parallel. For example, confidentiality refers to information (or inferences drawn on such) just as program refers to an executable stored on the disk, whereas control implies an active system just as process refers to a running program (as they say, “a process is a program in motion”). Also, you can compromise my data confidentiality with a completely passive attack such as reading my backup tapes, whereas controlling my system is inherently detectable since it involves interacting with it in some way.

2.1 Information Security is a PAIN

You can remember the security properties of information as PAIN; Privacy, Authenticity, Integrity, Non-repudiation.

2.2 Parkerian Hexad

There is something similar known as the “Parkerian Hexad”, defined by Donn B. Parker, which is six fundamental, atomic, non-overlapping attributes of information that are protected by information security measures:
  1. confidentiality
  2. possession
  3. integrity
  4. authenticity
  5. availability
  6. utility

2.3 Pentagon of Trust

  1. Admissibility (is the remote node trustworthy?)
  2. Authentication (who are you?)
  3. Authorization (what are you allowed to do?)
  4. Availability (is the data accessible?)
  5. Authenticity (is the data intact?)

2.4 Security Equivalency

I consider two objects to be security equivalent if they are identical with respect to the security properties under discussion; for precision, I may refer to confidentiality-equivalent pieces of information if the sets of parties to which they may be disclosed (without violating security) are exactly the same (and conversely, so are the sets of parties to which they may not be disclosed). In this case, I’m discussing objects which, if treated improperly, could lead to a compromise of the security goal of confidentiality. Or I could say that two cryptosystems are confidentiality-equivalent, in which case the objects help achieve the security goal. To be perverse, these last two examples could be combined; if the information in the first example was actually the keys for the cryptosystem in the second example, then disclosure of the first could impact the confidentiality of the keys and thus the confidentiality of anything handled by the cryptosystems. Alternately, I could refer to access-control equivalence between two firewall implementations; in this case, I am discussing objects which implement a security mechanism which helps us achieve the security goal, such as confidentiality of something.

2.5 Other Questions

  1. Secure to whom? A web site may be secure (to its owners) against unauthorized control, but may employ no encryption when collecting information from customers.
  2. Secure from whom? A site may be secure against outsiders, but not insiders.

3 Security Models

I intend to expand this section when I have some time.
Related information in Operating System Access Control (12.3↓).

4 Security Concepts

There is no security on this earth, there is only opportunity.
— General Douglas MacArthur (1880-1964)

These are important concepts which appear to apply across multiple security domains.

4.1 The Classification Problem

Many times in security you wish to distinguish between classes of data. This occurs in firewalls, where you want to allow certain traffic but not all; in intrusion detection, where you want to allow benign traffic but not malicious traffic; and in operating system security, where you wish to allow the user to run their programs but not malware (see 16.7↓). In doing so, we run into a number of limitations in various domains that deserve mention together.

4.1.1 Classification Errors

False positives vs. false negatives are also called Type I and Type II errors. Discuss equal error rate (EER) and its use in biometrics.
A more sophisticated measure is the Receiver Operating Characteristic (ROC) curve; see:

4.1.2 The Base-Rate Fallacy

In The Base Rate Fallacy and its Implications for Intrusion Detection, the author essentially points out that there’s a lot of benign traffic for every attack, and so even a small chance of a false positive will quickly overwhelm any true positives. Put another way, if one out of every 10,001 connections is malicious, and the test has a 1% false positive error rate, then for every 1 real malicious connection there are 10,000 benign connections, and hence 100 false positives.
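The arithmetic is worth checking for yourself; here is a minimal Python sketch using the numbers from the example above (a perfect detection rate is assumed for simplicity):

    # Base-rate arithmetic from the example above.
    malicious = 1          # 1 connection in 10,001 is malicious
    benign = 10_000
    fp_chance = 0.01       # the test's false positive rate
    tp_chance = 1.0        # assume every real attack is detected

    false_positives = benign * fp_chance      # 100 false alarms
    true_positives = malicious * tp_chance    # 1 real alert

    # Fraction of alerts that point at a real attack:
    precision = true_positives / (true_positives + false_positives)
    print(f"{false_positives:.0f} false alarms per real attack")
    print(f"precision: {precision:.2%}")      # about 0.99%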

4.1.3 Test Efficiency

In other cases, you are perfectly capable of performing an accurate test, but not on all the traffic. You may want to apply a cheap test with some errors on one side before applying a second, more expensive test on the side with errors to weed them out. In medicine, this is done with a “screening” test which has low false negatives, and then having concentrated the high risk population, you now diagnose with a more complex procedure with a low false positive rate because you’re now diagnosing a high-prevalence population. This is done in BSD Unix with packet capturing via tcpdump, which uploads a coarse filter into the kernel, and then applies a more expensive but finer-grained test in userland which only operates on the packets which pass the first test.
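Here is a small Python sketch of the two-stage idea, in the spirit of the tcpdump example; the packet fields, ports, and signatures are hypothetical, and real filters are of course far more elaborate:

    # Stage 1: cheap, permissive screen with few false negatives.
    def cheap_screen(packet):
        return packet["dst_port"] in {25, 80, 443}   # watched ports only

    # Stage 2: expensive, precise test, run only on the screened subset.
    SIGNATURES = [b"/etc/passwd", b"cmd.exe"]        # hypothetical patterns
    def expensive_test(packet):
        return any(sig in packet["payload"] for sig in SIGNATURES)

    def suspicious(stream):
        # Only packets passing the cheap screen pay the expensive cost,
        # mirroring tcpdump's kernel filter plus userland analysis.
        return [p for p in stream if cheap_screen(p) and expensive_test(p)]

    pkts = [{"dst_port": 80, "payload": b"GET /etc/passwd"},
            {"dst_port": 53, "payload": b"cmd.exe"}]
    print(suspicious(pkts))   # only the first packet reaches stage 2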

4.1.4 Incompletely-Defined Sets

As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality.
— Albert Einstein

Stop for a moment and think about the difficulty of trying to list all the undesirable things that your computer shouldn’t do. If you find yourself finished, then ask yourself; did you include that it shouldn’t attack other computers? Did you include that it shouldn’t transfer $1000 to a mafia-run web site when you really intended to transfer $100 to your mother? Did you include that it shouldn’t send spam to your address book? The list goes on and on.
Thus, if we had a complete list of everything that was bad, we’d block it and never have to worry about it again. However, often we either don’t know, or the set is infinite.
In some cases, it may be possible to define a list of good things (see 34.1↓); for example, the list of programs you might need to use in your job may be small, and so they could be enumerated. However, it is easy to imagine where whitelisting would be impossible; for example, it would be impractical to enumerate all the possible “good” network packets, because there’s just so many of them.
It is probably true that computer security is interesting because it is open-ended; we simply don’t know ahead of time whether something is good or bad.

4.1.5 The Guessing Hazard

So often we can’t enumerate all the things we would want to do, nor all the things that we would not want to do. Because of this, intrusion detection systems (see 16↓) often simply guess; they try to detect attacks unknown to them by looking for features that are likely to be present in exploits but not in normal traffic. At the current moment, you can find out if your traffic is passing through an IPS by trying to send a long string of 0x90 octets (x86 NOPs) in a session. This isn’t malicious by itself, but is a common byte with which people pad exploits (see 24.6↓). In this case, it’s a great example of a false positive, or collateral damage, generated through guilt-by-association; there’s nothing inherently bad about NOPs, it’s just that exploit writers use them a lot, and IPS vendors decided that made them suspicious.

I’m not a big fan of these because I feel that it breaks functionality that doesn’t threaten the system, and that it could be used as evidence of malfeasance against someone by someone who doesn’t really understand the technology. I’m already irritated by the false positives and excessive warnings about security tools from anti-virus software; it alerts on “potentially-unwanted programs” an absurd amount of the time, and most novices don’t understand that the anti-virus software reads the disk even when the programs aren’t being run, and that you have nothing to fear if you don’t run them. I fear that one day my Internet Service Provider will start filtering these out of my email or network streams, but fortunately they just don’t care that much.
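As a toy illustration of this kind of guessing (my sketch, not any vendor’s actual signature), consider flagging any payload containing a long run of 0x90 bytes; the threshold is arbitrary, and the innocent case shows how false positives arise:

    def looks_like_nop_sled(payload, threshold=64):
        # Guilt by association: flag long runs of 0x90 (x86 NOP).
        return b"\x90" * threshold in payload

    exploit_like = b"\x90" * 200 + b"...shellcode..."
    innocent = b"\x90" * 100      # e.g. a binary file that happens
                                  # to contain many 0x90 bytes
    print(looks_like_nop_sled(exploit_like))   # True
    print(looks_like_nop_sled(innocent))       # True: a false positive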

4.2 Security Layers

I like to think of security as a hierarchy. At the base, you have physical security. On top of that is OS security, and on top of that is application security, and on top of that, network security. The width of each layer of the hierarchy can be thought of as the level of security assurance, so that it forms a pyramid.
You may have an unbeatable firewall, but if your OS doesn’t require a password and your adversary has physical access to the system, you lose. So each layer of the pyramid can not be more secure (in an absolute sense) than the layer below it. Ideally, each layer should be available to fewer adversaries than the layer above it, so that one has a sort of balance or risk equivalency.
  1. network security
  2. application/database security
  3. OS security
  4. hardware security
  5. physical security
In network security, we concern ourselves with nodes in networks (that is, individual computers), and do not distinguish between users of each system. In some sense, we are assigning rights to computers and not people. We are defining which computers may talk to which other computers, or perhaps even to which applications. This is often justified since it is usually easier to leverage one user’s access to gain another’s within the same system than to gain access to another system (but this is not a truism).
In application or database security, we are concerned about how software applications handle security. For example, most databases have notions of users, and one may allow certain users to access certain databases, tables, or rows and not others. It is assumed that the adversary is one of the users of the system, and the discussion centers around what that user can or cannot do within the application, assuming that the user cannot bypass the application and attack the layers below it.
In operating system security, we distinguish between users of the system, and perhaps the roles they are fulfilling, and only concern ourselves with activities within that computer. It is assumed that the adversary has some access, but less than full privileges on the system.
Hardware security receives little discussion in security circles, but as processors and chipsets get more complex, there are more vulnerabilities being found within them. In hardware security, we assume that the adversary has root-level access on the system, and discuss what that enables the adversary to do.
When we discuss physical security, we assume that the adversary may physically approach the campus, building, room, or computer. We tend to create concentric security zones around the system, and try to keep adversaries as far away from it as possible. This is because if an adversary gains physical, unmonitored access to the computer system, it is virtually impossible to maintain the security of the system. This kind of discussion is particularly interesting to designers of tamper-resistant systems, such as digital satellite TV receivers.

4.3 Privilege Levels

Here’s a taxonomy of some commonly-useful privilege levels.
  1. Anonymous, remote systems
  2. Authenticated remote systems
  3. Local unprivileged user (UID > 0)
  4. Administrator (UID 0)
  5. Kernel (privileged mode, ring 0)
  6. Hardware (TPM, ring -1, hypervisors, trojaned hardware)
Actual systems may vary, levels may not be strictly hierarchical, etc. Basically the higher the privilege level you get, the harder you can be to detect. The gateways between the levels are access control devices, analogous with firewalls.

4.4 What is a Vulnerability?

Now that you know what a security property is, what constitutes (or should constitute) a vulnerability? On the arguable end of the scale we have “loss of availability”, or susceptibility to denial of service (DoS). On the inarguable end of the scale, we have “loss of control”, which usually means arbitrary code execution, which often means that the adversary can do whatever he wants with the system, and therefore can violate any other security property.
In an ideal world, every piece of software would state its assumptions about its environment, and then state the security properties it attempts to guarantee; this would be a security policy. Any violation of these explicitly-stated security properties would then be a vulnerability, and any other security properties would simply be “outside the design goals”. However, I only know of one piece of commonly-available software which does this, and that’s OpenSSL (http://oss-institute.org/FIPS_733/SecurityPolicy-1.1.1_733.pdf).

A vulnerability is a hole or a weakness in the application, which can be a design flaw or an implementation bug, that allows an attacker to cause harm to the stakeholders of an application. Stakeholders include the application owner, application users, and other entities that rely on the application. The term “vulnerability” is often used very loosely. However, here we need to distinguish threats, attacks, and countermeasures.

— OWASP Vulnerabilities Category (http://www.owasp.org/index.php/Category:Vulnerability)

Vulnerabilities can be divided roughly into two categories, implementation bugs and design flaws. Gary McGraw (http://www.cigital.com/~gem/), the host of the Silver Bullet Security Podcast (http://www.cigital.com/silverbullet/), reports that the vulnerabilities he finds are split into these two categories roughly evenly.

4.5 Vulnerability Databases

4.5.1 National Vulnerability Database

NVD is the U.S. government repository of standards based vulnerability management data represented using the Security Content Automation Protocol (SCAP). This data enables automation of vulnerability management, security measurement, and compliance. NVD includes databases of security checklists, security related software flaws, misconfigurations, product names, and impact metrics.
— NVD Home Page

4.5.2 Common Vulnerabilities and Exposures

International in scope and free for public use, CVE is a dictionary of publicly known information security vulnerabilities and exposures.
CVE’s common identifiers enable data exchange between security products and provide a baseline index point for evaluating coverage of tools and services.
— CVE Home Page

4.5.3 Common Weakness Enumeration

The Common Weakness Enumeration Specification (CWE) provides a common language of discourse for discussing, finding and dealing with the causes of software security vulnerabilities as they are found in code, design, or system architecture. Each individual CWE represents a single vulnerability type. CWE is currently maintained by the MITRE Corporation with support from the National Cyber Security Division (DHS). A detailed CWE list is currently available at the MITRE website; this list provides a detailed definition for each individual CWE.
— CWE Home Page

4.5.4 Open Source Vulnerability Database

OSVDB is an independent and open source database created by and for the community. Our goal is to provide accurate, detailed, current, and unbiased technical information.
— OSVDB Home Page

4.6 Accuracy Limitations in Making Decisions That Impact Security

On two occasions I have been asked, “Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?” In one case a member of the Upper, and in the other a member of the Lower, House put this question. I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.

— Charles Babbage

This is sometimes called the GIGO rule (Garbage In, Garbage Out). Stated this way, this seems self-evident. However, you should realize that this applies to systems as well as programs. For example, if your system depends on DNS to locate a host, then the correctness of your system’s operation depends on DNS. Whether or not this is exploitable (beyond a simple denial of service) depends a great deal on the details of the procedures. This is a parallel to the question of whether it is possible to exploit a program via an unsanitized input.
You can never be more accurate than the data you used for your input. Try to be neither precisely inaccurate, nor imprecisely accurate. Learn to use footnotes.

4.7 Rice’s Theorem

This appears to relate to the undecidability of certain problems related to arbitrary programs, of certain issues related to program correctness, and has important consequences like “no modern general-purpose computer can solve the general problem of determining whether or not a program is virus free”. A friend pointed out to me that the entire anti-virus industry depends on the public not realizing that this is proven to be an unsolvable (not just a difficult) problem. The anti-virus industry, when it attempts to generate signatures or “enumerate badness” (see 34.1↓), is playing a constant game of catch-up, usually a step or two behind their adversaries.
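The flavor of the argument can be given in a few lines of Python (a sketch of the standard diagonalization, not a formal proof): assume a perfect oracle exists, and construct a program that consults the oracle about itself and then does the opposite.

    def is_virus_free(program_source):
        # Supposed perfect malware oracle; Rice's theorem says no such
        # total, always-correct function can exist.
        raise NotImplementedError

    paradox = '''
    if is_virus_free(my_own_source):
        do_something_malicious()    # declared clean -> misbehave
    else:
        pass                        # declared a virus -> stay harmless
    '''
    # Whatever is_virus_free(paradox) answered, it would be wrong,
    # so the oracle cannot exist; detectors must approximate.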
Unfortunately, really understanding and (even moreso) explaining decidability problems requires a lot of thinking, and I’m not quite up to the task at the moment, so I’ll punt.

5 Economics of Security

5.1 How Expensive are Security Failures?

Here are some of the examples I could dig up.

5.1.1 TJ Maxx

TJ Maxx was using WEP at their stores and suffered a major loss of data, and large fines:

5.1.2 Greek Cell Tapping Incident

The Greek telephone tapping case of 2004-2005, also referred to as Greek Watergate, involved the illegal tapping of more than 100 mobile phones on the Vodafone Greece network belonging mostly to members of the Greek government and top-ranking civil servants.
On October 19, 2007, Vodafone Greece was again fined €19 million by EETT, the national telecommunications regulator, for alleged breach of privacy rules.

5.1.3 VAServ/LxLabs

The discovery of 24 security vulnerabilities may have contributed to the death of the chief of LxLabs. A flaw in the company’s HyperVM software allowed data on 100,000 sites, all hosted by VAserv, to be destroyed. The HyperVM solution is popular with cheap web hosting services and the attacks are easy to reproduce, which could lead to further incidents.

5.1.4 CardSystems

5.1.5 Egghead Software

Egghead was hurt by a December 2000 revelation that hackers had accessed its systems and potentially compromised customer credit card data. The company filed for bankruptcy in August 2001. After a deal to sell the company to Fry’s Electronics for $10 million fell through, its assets were acquired by Amazon.com for $6.1 million.

In December 2000, the company’s IIS-based servers were compromised, potentially releasing credit card data of over 3.6 million people. In addition to poor timing near the Christmas season, the handling of the breach by publicly denying that there was a problem, then notifying Visa, who in turn notified banks, who notified consumers, caused the breach to escalate into a full blown scandal.
— Wikipedia

5.1.6 Heartland Payment Systems

5.1.7 Verizon Data Breach Study

Note that Verizon conducted the study, and one should not construe this section to mean that they had any data breaches themselves.

5.1.8 Web Hacking Incidents Database

5.1.9 DATALOSSdb

5.1.10 Data Breach Investigations Report

5.2 Abuse Detection and Response: A Cost-Benefit Perspective

As I mentioned earlier, abuse detection is a kind of classification problem (see 4.1↑), which will forever be an imprecise science.
In general, you want to balance the costs of false positives and false negatives. If we assume “rate” means “per unit of time”, or “per number of interactions with the outside world”, then the equation would be:
fprate*fpcost = fnrate*fncost
Note that the definitions are very important to the equation! The ratio of abuse or intrusion attempts to legitimate traffic is usually rather low, and so naively substituting a per-event chance of misclassification for the fprate above will give an incorrect result. This is related to the base-rate fallacy described above (see 4.1.2↑). What you probably want, then, is to define the abuse ratio (abrat) as the fraction of incoming requests that are abuse attempts; false positives arise from the benign fraction of traffic, and false negatives from the abusive fraction, so you get:
fprate = (1 − abrat)*fpchance
fnrate = abrat*fnchance
Thus, if we wish to avoid the term “rate” as being misleading, then the equation should really be:
(1 − abrat)*fpchance*fpcost = abrat*fnchance*fncost
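Here is a minimal Python sketch of that balance; all of the numbers are hypothetical, and the costs are in whatever units you care to measure:

    abrat = 1 / 10_001    # fraction of requests that are abuse attempts
    fp_chance = 0.01      # chance a benign request is wrongly flagged
    fn_chance = 0.05      # chance an abuse attempt is missed
    fp_cost = 10.0        # e.g. analyst minutes per false alarm
    fn_cost = 1_000.0     # e.g. damage from one missed intrusion

    fp_expected = (1 - abrat) * fp_chance * fp_cost   # per request
    fn_expected = abrat * fn_chance * fn_cost         # per request

    # If fp_expected dominates, tighten detection or validate alerts
    # further; if fn_expected dominates, loosen it or respond cheaply.
    print(f"expected FP cost per request: {fp_expected:.5f}")
    print(f"expected FN cost per request: {fn_expected:.5f}")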
Abuse detection (see 16↓) is all about the failure chances (and thus, rates as defined above). Abuse response choices (see 17↓) determine the cost. For example, anomaly detection will give a higher false positive rate (and lower false negative rate) than misuse detection (see 16.2↓).
If your response to abuse causes an alert (see 17.1↓) to be generated, and a human must investigate it, then the false positive cost will be high, so you might want to (for example) do some further validation of the detection event to lower the false positive rate. For example, if your IDS detected a Win32 attack against a Linux system, you might want to avoid generating an alert.
On the other hand, if you can cheaply block an abuser, and suffer no ill effects from doing so even if it was a false positive, then you can take a liberal definition of what you consider abusive. To use the above example, one might wish to taint the source (see 17.2.2↓) and shun him, even if the Win32 attack he launched could not have worked against the Linux box.
Intrusion detection is merely a subset of abuse detection, since an intrusion is only one kind of abuse of a system.
See also 35.7↓, 35.8↓.

6 Adversary Modeling

If you know the enemy and know yourself, you need not fear the result of a hundred battles.
If you know yourself but not the enemy, for every victory gained you will also suffer a defeat.
If you know neither the enemy nor yourself, you will succumb in every battle.

— Sun Tzu, The Art of War (http://en.wikipedia.org/wiki/The_Art_of_War)

After deciding what you need to protect (your assets), you need to know about the threats you wish to protect it against, or the adversaries (sometimes called threat agents) which may threaten it. Generally intelligence units have threat shops, where they monitor and keep track of the people who may threaten their operations. This is natural, since it is easier to get an idea of who will try and do something than how some unspecified person may try to do it, and can help by hardening systems in enemy territory more than those in safer areas, leading to more efficient use of resources. I shall call this adversary modeling.
In adversary modeling, the implicit assumptions are that you have a limited budget and the number of threats is so large that you cannot defend against all of them. So you now need to decide where to allocate your resources. Part of this involves trying to figure out who your adversaries are and what their capabilities and intentions are, and thus how much to worry about particular domains of knowledge or technology. You don’t have to know their name, location and social security number; it can be as simple as “some high school student on the Internet somewhere who doesn’t like us”, “a disgruntled employee” (as opposed to a gruntled employee), or “some sexually frustrated script-kiddie on IRC who doesn’t like the fact that he is a jerk who enjoys abusing people and therefore his only friends are other dysfunctional jerks like him”. People in charge of doing attacker-centric threat modeling must understand their adversaries and be willing to take chances by allocating resources against an adversary which hasn’t actually attacked them yet, or else they will always be defending against yesterday’s adversary, and get caught flat-footed by a new one.

6.1 Common Psychological Errors

The excellent but poorly titled [A] book Stumbling on Happiness tells us that we make two common kinds of errors when reasoning about other humans:
  1. Overly different; if you looked at grapes all day, you’d know a hundred different kinds, and naturally think them very different. But they all squish when you step on them, they are all fruits and frankly, not terribly different at all. So too we are conditioned to see people as different because the things that matter most to us, like finding an appropriate mate or trusting people, cannot be discerned with questions like “do you like breathing?”. An interesting experiment showed that a description by people who had gone through a process of how they felt was more accurate in predicting how a person will feel after the process than a description of the process itself. Put another way, people assume that the experience of others depends heavily on minor differences between humans, when in fact we mentally exaggerate those differences.
  2. Overly similar; people assume that others are motivated by the same things they are motivated by; we project onto them a reflection of our self. If a financier or accountant has ever climbed mount Everest, I am not aware of it. Surely it is a cost center, yes?

6.2 Cost-Benefit

Often, the lower layers of the security hierarchy cost more to build out than the higher levels. Physical security requires guards, locks, iron bars, shatterproof windows, shielding, and various other things which, being physical, cost real money. On the other hand, network security may only need a free software firewall. However, what an adversary could cost you during a physical attack (e.g. a burglar looting your home) may be greater than an adversary could cost you by defacing your web site.

6.3 Risk Tolerance

We may assume that the distribution of risk tolerance among adversaries is monotonically decreasing; that is, the number of adversaries who are willing to try a low-risk attack is greater than the number of adversaries who are willing to attempt a high-risk attack to get the same result. Beware of risk evaluation though; while a hacker may be taking a great risk to gain access to your home, local law enforcement with a valid warrant is not going to be risking as much.
So, if you are concerned about a whole spectrum of adversaries, known and unknown, you may wish to have greater network security than physical security, simply because there are going to be more remote attacks.

6.4 Capabilities

You only have to worry about things to the extent they may lie within the capabilities of your adversaries. It is rare that adversaries use outside help when it comes to critical intelligence; it could, for all they know, be disinformation, or the outsider could be an agent-provocateur.

6.5 Sophistication Distribution

If they were capable, honest, and hard-working, they wouldn’t need to steal.

Along similar lines, one can assume a monotonically decreasing number of adversaries with a certain level of sophistication. My rule of thumb is that for every person who knows how to perform a technique, there are x people who know about it, where x is a small number, perhaps 3 to 10. The same rule applies to people with the ability to write an exploit versus those able to download and use it (the so-called script kiddies). Once an exploit is coded into a worm, the chance of a compromised host having been compromised by the worm (instead of a human who targets it specifically) approaches 100%.

6.6 Goals

We’ve all met or know about people who would like nothing more than to break things, just for the heck of it; schoolyard bullies who feel hurt and want to hurt others, or their overgrown sadist kin. Vandals who merely want to write their name on your storefront. A street thug who will steal a cell phone just to throw it through a window. I’m sure the sort of person reading this isn’t like that, but unfortunately some people are. What exactly are your adversary’s goals? Are they to maximize ROI (Return On Investment) for themselves, or are they out to maximize pain (tax your resources) for you? Are they monetarily or ideologically motivated? What do they consider investment? What do they consider a reward? Put another way, you can’t just assign a dollar value on assets, you must consider their value to the adversary.

7 Threat Modeling

Men of sense often learn from their enemies. It is from their foes, not their friends, that cities learn the lesson of building high walls and ships of war.
— Aristophanes

In technology, people tend to focus on how rather than who, which seems to work better when anyone can potentially attack any system (like with publicly-facing systems on the Internet) and when protection mechanisms have low or no incremental cost (like with free and open-source software). I shall call modeling these threats threat modeling (http://en.wikipedia.org/wiki/Threat_model).

7.1 Common Platform Enumeration

CPE is a structured naming scheme for information technology systems, software, and packages. Based upon the generic syntax for Uniform Resource Identifiers (URI), CPE includes a formal name format, a method for checking names against a system, and a description format for binding text and tests to a name.
— CPE Home Page

The first part of threat modeling should be: what is it I want to protect? And once you start to compile a list of things you wish to protect, you might want a consistent naming system for your computer assets. The CPE may help you here.

7.2 A Taxonomy of Privacy Breaches

In the above article, Daniel Solove suggests that breaches of privacy are not of a single type, but can mean a variety of things:
  • surveillance
  • interrogation
  • aggregation
  • identification
  • insecurity
  • secondary use
  • exclusion
  • breach of confidentiality
  • disclosure
  • exposure
  • increased accessibility
  • blackmail
  • appropriation
  • distortion
  • intrusion
  • decisional interference

7.3 Threats to Security Properties

An important mnemonic for remembering the threats to security properties, originally introduced for threat modeling at Microsoft, is STRIDE:
  • Spoofing
  • Tampering
  • Repudiation
  • Information disclosure
  • Denial of service
  • Elevation of privilege
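Each STRIDE threat can be read as the negation of a security property from section 2 (or of authorization); the usual pairing, sketched as a small Python table:

    # Common STRIDE-to-property pairing (each threat negates a property).
    STRIDE = {
        "Spoofing": "authentication",
        "Tampering": "integrity",
        "Repudiation": "non-repudiation",
        "Information disclosure": "confidentiality",
        "Denial of service": "availability",
        "Elevation of privilege": "authorization",
    }
    for threat, prop in STRIDE.items():
        print(f"{threat} violates {prop}")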
Related links:

7.4 Quantifying Risk

Microsoft has a rating system for calculating risks (http://msdn.microsoft.com/en-us/library/ff648644.aspx). Its mnemonic is DREAD:
  • Damage potential
  • Reproducibility
  • Exploitability
  • Affected users
  • Discoverability
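DREAD is commonly applied by rating each factor on a fixed scale and averaging; here is a minimal sketch (the 1-10 scale and the sample ratings are illustrative, not Microsoft’s worked example):

    def dread_score(damage, reproducibility, exploitability,
                    affected_users, discoverability):
        # Each factor rated 1 (low risk) to 10 (high risk).
        return (damage + reproducibility + exploitability +
                affected_users + discoverability) / 5

    # Hypothetical threat: SQL injection in a public login form.
    print(dread_score(damage=8, reproducibility=10, exploitability=7,
                      affected_users=10, discoverability=9))   # 8.8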

7.5 Attack Surface

Gnothi Seauton (“Know Thyself”)

— ancient Greek aphorism (http://en.wikipedia.org/wiki/Know_thyself)

When discussing security, it’s often useful to analyze the part of the system which may interact with a particular adversary (or set of adversaries). For example, let’s assume you are only worried about remote adversaries. If your system or network is only connected to the outside world via the Internet, then the attack surface is the parts of your system that interact with things on the Internet, or the parts of your system which accept input from the Internet. A firewall, then, limits the attack surface to a smaller portion of your systems by filtering some of your network traffic. Often, the firewall blocks all incoming connections.
Sometimes the attack surface is pervasive. For example, if you have a network-enabled embedded device like a web cam on your network that has a vulnerability in its networking stack, then anything which can send it packets may be able to exploit it. Since you probably can’t fix the software in it, you must then use a firewall to attempt to limit what can trigger the bug. Similarly, there was a bug in Sendmail that could be exploited by sending a carefully-crafted email through a vulnerable server. The interesting bit here is that it might be an internal server that wasn’t exposed to the Internet; the exploit was data-directed and so could be passed through your infrastructure until it hit a vulnerable implementation. That’s why I consistently use one implementation (not Sendmail) throughout my network now.
If plugging a USB drive into your system causes it to automatically run things like a standard Microsoft Windows XP installation, then any plugged-in device is part of the attack surface. But even if it does not, then by plugging a USB device in you could potentially overflow the code which handles the USB or the driver for the particular device which is loaded; thus, the USB networking code and all drivers are part of the attack surface if you can control what is plugged into the system.
Moreover, a recent vulnerability (http://it.slashdot.org/it/08/01/14/1319256.shtml) illustrates that when you have something which inspects network traffic, such as uPNP devices or port knocking daemons, then their code forms part of the attack surface.
Sometimes you will hear people talk about the anonymous attack surface; this is the attack surface available to everyone (on the Internet). Since this number of people is so large, and you usually can’t identify them or punish them, you want to be really sure that the anonymous attack surface is limited and doesn’t have any so-called “pre-auth” vulnerabilities, because those can be exploited prior to identification and authentication.

7.6 Attack Trees

The next logical step is to move from defining the attack surface to modeling attacks and quantifying risk levels.

7.7 The Weakest Link

Amdahl’s law, also known as Amdahl’s argument, is named after computer architect Gene Amdahl, and is used to find the maximum expected improvement to an overall system when only part of the system is improved.

— Wikipedia (http://en.wikipedia.org/wiki/Amdahl%27s_law)

You are the weakest link, goodbye!

— The Weakest Link (TV series)

Let us think of our security posture for whatever we’re protecting as being composed of a number of systems (or groups of systems possibly offering defense-in-depth). The strength of these systems to attack may vary. You may wish to pour all your resources into one, but the security will likely be broken at the weakest point, either by chance or by an intelligent adversary.
This is an analogy to Amdahl’s law, stated above, in that we can only increase our overall security posture by maintaining a delicate balance between the different defenses to attack vectors. Most of the time, your resources are best spent on the weakest area, which for some institutions (financial, military) is usually personnel.
The reasons you might not balance all security systems may include:
Economics matter here; it may be much cheaper and more reliable to buy a firewall than to put your employees through security training. Software security measures sometimes have zero marginal cost, but hardware almost always has a marginal cost.
Exposure affects your risk calculations; an Internet attack is much more likely than a physical attack, so you may put more effort into Internet defense than physical defense.
Capability matters in that organizations have varying abilities. For example, the military may simply make carrying a thumb drive into the facility a punishable offense, but a commercial organization may find that too difficult or unpopular to enforce. An Internet company, by contrast, may have a strong technical capability, and so might choose to write software to prevent the use of thumb drives.

8 Physical Security

When people think of physical security, they often think of access control devices such as locks, but these are often not the limiting factor; I recall a story of a cat burglar who used a chainsaw to cut through victims’ walls, bypassing any access control devices entirely. I remember reading someone saying that a deep space probe is the ultimate in physical security.

8.1 No Physical Security Means No Security

While the locks are getting tougher, the door and frame are getting weaker. A well-placed kick usually does the trick.
— a burglar

A couple of limitations come up without physical security for a system. For confidentiality, all of the sensitive data needs to be encrypted. But even if you encrypt the data, an adversary with physical access could trojan the OS and capture the data (this is a control attack now, not just a confidentiality breach; if you go this far, you’ve protected against overt seizure, theft, improper disposal and such). So you’ll need to protect the confidentiality and integrity of the OS; if you do, he trojans the kernel. If you protect the kernel, he trojans the boot loader. If you protect the boot loader (say, by putting it on a removable medium), he trojans the BIOS. If you protect the BIOS, he trojans the CPU. So you put a tamper-evident label on it, with your signature on it, and check it every time. But he can install a keyboard logger. So suppose you make a sealed box with everything in it, and connectors on the front. Now he gets measurements and photos of your machine, spends a fortune replicating it, replaces your system with an outwardly identical one of his design (the trojan box), which communicates (say, via encrypted spread-spectrum radio) with your real box. When you type plaintext, it goes through his system, gets logged, and is relayed to your system as keystrokes. Since you talk plaintext, neither of you is the wiser.
The physical layer is a common place to facilitate a side-channel attack (see 31.2↓).

8.2 Data Remanence

I know what your computer did last summer.

Data remanence is the residual physical representation of your information on media after you believe that you have removed it (definition thanks to Wikipedia, http://en.wikipedia.org/wiki/Data_remanence). This is a disputed region of technology, with a great deal of speculation and many self-styled experts, but very little hard science.
Last time I looked, most degaussers require 220V power and may not work on hard drives, due to their high coercivity.
As of 2006, the most definitive study seems to be the NIST Computer Security Division paper Guidelines for Media Sanitization (http://csrc.nist.gov/publications/nistpubs/800-88/NISTSP800-88_rev1.pdf). NIST is known to work with the NSA on some topics, and this may be one of them. It introduces some useful terminology:
disposing is the act of discarding media with no other considerations
clearing is a level of media sanitization that resists anything you could do at the keyboard or remotely, and usually involves overwriting the data at least once
purging is a process that protects against a laboratory attack (signal processing equipment and specially trained personnel)
destroying is the ultimate form of sanitization, and means that the medium can no longer be used as originally intended

8.2.1 Magnetic Storage Media (Disks)

The seminal paper on this is Peter Gutmann’s Secure Deletion of Data from Magnetic and Solid-State Memory (http://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html). In early versions of his paper, he speculated that one could extract data due to hysteresis effects even after a single overwrite, but on subsequent revisions he stated that there was no evidence a single overwrite was insufficient. Simson Garfinkel wrote about it recently in his blog (https://www.techreview.com/blog/garfinkel/17567/).
The NIST paper has some interesting tidbits in it. Obviously, disposal cannot protect confidentiality of unencrypted media. Clearing is probably sufficient security for 99% of all data; I highly recommend Darik’s Boot and Nuke (http://dban.sourceforge.net/), which is a bootable floppy or CD based on Linux. However, it cannot work if the storage device stops working properly, and it does not overwrite sectors or tracks marked bad and transparently relocated by the drive firmware. With all ATA drives over 15GB, there is a “secure delete” ATA command which can be accessed from hdparm within Linux, and Gordon Hughes has some interesting documents and a Microsoft-based utility (http://cmrr.ucsd.edu/people/Hughes/SecureErase.shtml). There’s a useful blog entry about it (http://storagemojo.com/2007/05/02/secure-erase-data-security-you-already-own/). In the case of very damaged disks, you may have to resort to physical destruction. However, with disk densities being what they are, even 1/125” of a disk platter may hold a full sector, and someone with absurd amounts of money could theoretically extract small quantities of data. Fortunately, nobody cares this much about your data.
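For illustration, clearing by a single overwrite pass can be sketched in a few lines of Python; note that, as described above, this cannot reach sectors the firmware has remapped, a dedicated tool like DBAN or the ATA secure erase command is preferable, and the device path here is hypothetical (this destroys data):

    import os

    def clear_device(path, chunk=1 << 20):
        fd = os.open(path, os.O_WRONLY)
        try:
            size = os.lseek(fd, 0, os.SEEK_END)   # device/file size
            os.lseek(fd, 0, os.SEEK_SET)
            zeros = b"\x00" * chunk
            written = 0
            while written < size:
                written += os.write(fd, zeros[:min(chunk, size - written)])
            os.fsync(fd)                          # force it to the medium
        finally:
            os.close(fd)

    # clear_device("/dev/sdX")   # hypothetical target; irreversible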
Now, you may wonder what you can do about very damaged disks, or what to do if the media isn’t online (for example, you buried it in an underground bunker), or if you have to get rid of the data fast. I would suggest that encrypted storage (see 28.7↓) would almost always be a good idea. If you use it, you merely have to protect the confidentiality of the key, and if you can properly sanitize the media, all the better. Recently Simson Garfinkel re-discovered a technique for getting the data off broken drives; freezing them. Another technique that I have used is to replace the logic board with one from a working drive.

8.2.2 Semiconductor Storage (RAM)

Peter Gutmann’s Data Remanence in Semiconductor Devices (http://www.cypherpunks.to/~peter/usenix01.pdf) shows that if a particular value is held in RAM for extended periods of time, various processes such as electromigration make permanent changes to the semiconductor’s structure. In some cases, it is possible for the value to be “burned in” to the cell, such that it cannot hold another value.
Cold Boot Attack
Recently a Princeton team (http://citp.princeton.edu/memory/) found that the values held in DRAM decay in predictable ways after power is removed, such that one can merely reboot the system and recover keys for most encrypted storage systems (http://citp.princeton.edu/pub/coldboot.pdf). If the chip is cooled first, the data remains longer. This generated much talk in the industry, and prompted an interesting overview of attacks against encrypted storage systems (http://www.news.com/8301-13578_3-9876060-38.html).
Direct Memory Access
It turns out that certain peripheral devices, notably Firewire, have direct memory access.
This means that you can plug something into the computer and read data directly out of RAM.
That means you can read passwords directly out of memory:
Reading RAM With A Laser

8.3 Smart Card Attacks

This section deserves great expansion.
Instead I’ll punt and point you at the latest USENIX conference on this:

9 Hardware Security

9.1 Introduction

Hardware security is a term I invented to describe the security models provided by a CPU (http://en.wikipedia.org/wiki/Central_processing_unit), associated chipset (http://en.wikipedia.org/wiki/Chipset) and peripheral hardware. The assumption here is that the adversary can create and execute program code of his own choosing, possibly as an administrator (root). As computer hardware and firmware (http://en.wikipedia.org/wiki/Firmware) become more complex, more and more vulnerabilities will be found in them, so this section is likely to grow over time.
Each computer hardware architecture is going to have its own security models, so this discussion is going to be specific to the hardware platform under consideration.

9.2 Protection Rings

Most modern computer systems have at least two modes of operation; normal operation and privileged mode. The vast majority of software runs in normal mode, and the operating system, or more accurately the kernel, runs in privileged mode. Similarly, most of the functionality of the CPU is available in normal mode, whereas a small but significant portion, such as that related to memory management and communicating with hardware, is restricted to that operating in privileged mode.
Some CPU architectures go farther and define a series of hierarchical protection domains that are often called protection rings (http://en.wikipedia.org/wiki/Ring_(computer_security)). This is a simple extrapolation of the two-level normal/privileged mode into multiple levels, or rings.

9.3 Operating Modes

The Intel architecture in particular has several operating modes. These are not privilege rings, but rather represent the state that the CPU is in, which affects how various instructions are interpreted.

9.4 NX bit

The NX bit, which stands for No eXecute, is a technology used in CPUs to segregate areas of memory for use by either storage of processor instructions (or code) or for storage of data, a feature normally only found in Harvard architecture processors. However, the NX bit is being increasingly used in conventional von Neumann architecture processors, for security reasons.
An operating system with support for the NX bit may mark certain areas of memory as non-executable. The processor will then refuse to execute any code residing in these areas of memory. The general technique, known as executable space protection, is used to prevent certain types of malicious software from taking over computers by inserting their code into another program’s data storage area and running their own code from within this section; this is known as a buffer overflow attack.
— Wikipedia

9.5 Supervisors and Hypervisors

9.6 Trusted Computing

9.7 Intel vPro

Not really a backdoor, but the wake-on-lan and remote management facilities could be used by an attacker.

9.8 Hardware Vulnerabilities and Exploits

10 Distributed Systems

10.1 Network Security Overview

The things involved in network security are called nodes. One can talk about networks composed of humans (social networks), but that’s not the kind of network we’re talking about here; I always mean a computer unless I say otherwise. Often in network security the adversary is assumed to control the network in whole or part; this is a bit of a holdover from the days when the network was radio, or when the node was an embassy in a country controlled by the adversary. In modern practice, this doesn’t seem to usually be the case, but it’d be hard to know for sure. In the application of network security to the Internet, we almost always assume the adversary controls at least one of the nodes on the network.
In network security, we can lure an adversary to a system, tempt them with something inviting; such a system is called a honeypot, and a network of such systems is sometimes called a honeynet. A honeypot may or may not be instrumented for careful monitoring; sometimes systems so instrumented are called fishbowls, to emphasize the transparent nature of activity within them. Often one doesn’t want to allow a honeypot to be used as a launch point for attacks, so outbound network traffic is sanitized or scrubbed; if traffic to other hosts is blocked completely, some people call it a jail, but that is also the name of an operating system security technology used by FreeBSD, so I consider it confusing.
To reduce a distributed system problem to a physical security (see 8↑) problem, you can use an air gap, or sneakernet between one system and another. However, the data you transport between them may be capable of exploiting the offline system. One could keep a machine offline except during certain windows; this could be as simple as a cron job which turns on or off the network interface via ifconfig. However, an offline system may be difficult to administer, or keep up-to-date with security patches.

10.2 Network Access Control: Packet Filters, Firewalls, Security Zones

Most network applications use TCP, a connection-oriented protocol, and they use a client/server model. The client initiates a handshake with the server, and then they have a conversation. Sometimes people use the terms client and server to mean the application programs, and other times they mean the node itself. Other names for server applications include services and daemons. Obviously if you can’t speak with the server at all, or (less obviously) if you can’t properly complete a handshake, you will find it difficult to attack the server application. This is what a packet filter does; it allows or prevents communication between a pair of sockets. A packet filter generally does no more than simple all-or-nothing filtering. Now, every computer can potentially have a network access control device, or packet filter, on it. For security, this would be the ideal; each machine defends itself, opening up the minimum number of ports to external traffic. However, tuning a firewall for minimum exposure can be a difficult, time-consuming process and so does not scale well. It would be better for network daemons to not accept connections from across the network, and there definitely has been a move in this direction. In some cases, a packet filter would merely be redundant to a system which does not have any extraneous open ports.
The firewall was originally defined as a device between networks with different security characteristics; it was named after the barrier between an automobile interior and the engine, which is designed to prevent an engine fire from spreading to the passenger cabin. Nowadays, a firewall can be installed on every system, protecting it from all other systems.
As our understanding of network security improved, people started to define various parts of their network. The canonical types of networks are:
  • Trusted networks are those internal to your corporation.
  • An untrusted network may be the Internet, or a wifi network, or any network with open, public access.
  • Demilitarized zones (DMZs) were originally defined as an area for placing machines that must talk to nodes on both trusted and untrusted networks. At first they were placed outside the firewall but inside a border router, then as a separate leg of the firewall, and now they are defined and protected in a variety of ways.
What these definitions all have in common is that they end up defining security zones (this term thanks to the authors of Extreme Exploits). All the nodes inside a security zone have roughly equivalent access to or from other security zones. I believe this is the most important and fundamental way of thinking about network security. Do not confuse this with the idea that all the systems in the zone have the same relevance to the network’s security, or that the systems have the same impact if compromised; that is a complication, and more a matter of operating system security than network security. In other words, two systems (a desktop and your DNS server) may not be security equivalent, yet they may be in the same security zone.

10.3 Network Reconnaissance: Ping Sweeps, Port Scanning

Typically an adversary needs to know what he can attack before he can attack it. This is called reconnaissance, and involves gathering information about the target and identifying ways in which he can attack it. In network security, the adversary may want to know what systems are available for attack, and a technique such as a ping sweep of your network block may facilitate this. Then, he may choose to enumerate (get a list of) all the services available via a technique such as a port scan. A port scan may be a horizontal scan (one port, many IP addresses) or a vertical scan (one IP address, multiple ports), or some combination thereof. You can sometimes determine which service (and possibly which implementation) is listening by banner grabbing or fingerprinting the service.
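As a sketch of what a vertical scan amounts to, here is a minimal Python example using plain TCP connects; the target address is a documentation address and an assumption, and of course you should only scan hosts you are authorized to test.

    import socket

    def vertical_scan(host, ports, timeout=0.5):
        open_ports = []
        for port in ports:
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                s.settimeout(timeout)
                if s.connect_ex((host, port)) == 0:  # 0 means it connected
                    open_ports.append(port)
        return open_ports

    print(vertical_scan("192.0.2.10", range(20, 1025)))

A horizontal scan is the same loop turned inside out: one port, iterated over many addresses.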
In an ideal world, knowing that you can talk to a service does not matter. Thus, a port scan should reveal only what you assumed your adversary already knew. However, scanning is considered very rude, even antisocial, like walking down the street and trying to open the front door of every house or business that you pass; people will assume you are trying to trespass, and possibly illicitly copy their data.
Typical tools used for network reconnaissance include:

10.4 Network Intrusion Detection and Prevention

Most security-conscious organizations are capable of detecting most scans using [network] intrusion detection systems (IDS) or intrusion prevention systems (IPS); see 16↓.

10.5 Cryptography is the Sine Qua Non of Secure Distributed Systems

All cryptography lets you do is create trust relationships across untrustworthy media; the problem is still trust between endpoints and transitive trust.
— Marcus Ranum

Put simply, you can’t have a secure distributed system (with the normal assumptions of untrusted nodes and network links potentially controlled by the adversary) without using cryptography somewhere (“sine qua non” is Latin for “without which it could not be”). If the adversary can read communications, then to protect the confidentiality of the network traffic, it must be encrypted. If the adversary can modify network communication, then it must have its integrity protected and be authenticated (that is, to have the source identified). Even physical layer communication security technologies, like the KLJN cipher, quantum cryptography, and spread-spectrum communication, use cryptography in one way or another.
I would go further and say that basing network security decisions on anything other than cryptographic keys is never going to be as strong as basing them on cryptography. Very few Internet adversaries currently have the capability to arbitrarily route data around. Most cannot jump between VLANs on a tagged port. Some don’t even have the capability to sniff on their LAN. But none of the mechanisms preventing these things are stronger than strong cryptography, and often they are much weaker, possibly amounting only to security through obscurity. To see why, think about how much assurance a firewall has that a packet claiming to be from a given IP address is actually from the system the firewall maintainer believes it to be; often these things are complex, and well beyond his control. However, it would be totally reasonable to filter on IP address first, and only then perform a cryptographic check; this makes the system resistant to resource consumption attacks from anyone who cannot spoof a legitimate IP address (see 4.1.1↑).
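Here is a minimal sketch of that layered check in Python: a cheap IP-address filter first, then an HMAC over the message as the real cryptographic decision. The allowlist and pre-shared key are illustrative assumptions.

    import hashlib
    import hmac

    ALLOWED = {"203.0.113.5"}     # assumed legitimate peer address
    KEY = b"pre-shared-key"       # assumed shared secret

    def accept(src_ip, message, tag):
        if src_ip not in ALLOWED:
            return False          # cheap check absorbs unspoofed floods
        expected = hmac.new(KEY, message, hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)  # the real decision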

10.6 Hello, My Name is 192.168.1.1

Humans are incapable of securely storing high-quality cryptographic keys, and they have unacceptable speed and accuracy when performing cryptographic operations. (They are also large, expensive to maintain, difficult to manage, and they pollute the environment. It is astonishing that these devices continue to be manufactured and deployed. But they are sufficiently pervasive that we must design our protocols around their limitations).
— Network Security / PRIVATE Communication in a PUBLIC World by Charlie Kaufman, Radia Perlman, & Mike Speciner (Prentice Hall 2002; p.237)

Because humans communicate slowly, in plaintext, and don’t plug into a network, we consider the nodes within the network to be computing devices. The system a person interacts with is security equivalent to that person; break into the system administrator’s console, and you have access to anything he or she accesses. In some cases, you may have access to anything he or she can access. You may think that your LDAP or Kerberos server is the most important, but isn’t the node of the person who administers it just as critical? This is especially true if OS security is weak and any user can control the system, or if the administrator is not trusted. Treating nodes rather than people as the principals is also convenient, because packets do not have user names, just source IPs. When some remote system connects to a server, unless both are under the control of the same entity, the server has no reason to trust the remote system’s claim about who is using it, nor does it have any reason to treat one user on the remote system differently than any other.

10.7 Source Tapping; The First Hop and Last Mile

One can learn a lot more about a target by observing the first link from them than from some more remote vantage point. That is, the best vantage point is the one closest to the target. For this reason, the first hop is far more critical than any other. An exception may involve a target that is more network-mobile than the eavesdropper. The more common exception is tunneling/encryption (including tor and VPN technologies); these relocate the first hop somewhere that is not physically proximate to the target’s meatspace coordinates, which may make the target more difficult to locate.
Things to consider here include the difficulty of interception, though that is a secondary concern (it is never all that difficult). For example, using an ISP’s caching proxy is probably less confidential with respect to the ISP than accessing the service directly, since most proxy software makes it trivial to log the connection and content; however, one should not assume that avoiding the proxy makes one safe (especially now that many ISPs do transparent proxying). On the other hand, accessing a remote site directly is less anonymous with respect to that site; using the ISP’s proxy affords some anonymity (unless the remote site colludes with the ISP).

10.8 Security Equivalent Things Go Together

One issue that always seems to come up is availability versus other goals. For example, suppose you install a new biometric voice recognition system. Then you have a cold and can’t get in. Did you prioritize correctly? Which is more important? Similar issues come up almost everywhere in security. For example, your system may authenticate users against a global server, or it may have a local database for authentication. The former means that one can revoke a user’s credentials globally and immediately, but also means that if the global server is down, nobody can authenticate. Attempts to get the best of both worlds (“authenticate locally if the global server is unreachable”) often reduce to the weaker of the two: the adversary simply DoSes the link between the system and the global server to force local authentication.
My philosophy on this is simple; put like things together. That is, I think authentication information for a system should be on the system. That way, the system is essentially a self-contained unit. By spreading the data out, one multiplies potential attack targets, and reduces availability. If someone can hack the local system, then being able to alter a local authentication database is relatively insignificant.

10.9 Man In The Middle

How do we detect man-in-the-middle (MITM) attacks or impersonation in web, PGP/GPG, and SSH contexts?
The typical process for creating an Internet connection involves a DNS resolution at the application layer (unless you use IP addresses), then sending packets to the IP address (at the network layer), which have to be routed; at the link layer, ARP typically is used to find the next hop at each stage, and then bits are marshalled between devices at the physical layer. Each of these steps creates the opportunity for a man-in-the-middle attack.

10.9.1 DNS MITM Issues

10.9.2 IP Routing MITM Issues

The adversary could announce bogus BGP routes (http://tools.ietf.org/html/rfc4272).
The adversary could naturally sit between you and the remote system.

10.9.3 Link Layer MITM Issues

The adversary could use ARP spoofing or poisoning, such as with these tools:

10.9.4 Physical Layer MITM Issues

The classic physical-layer attack is tapping the wire (or listening to a wireless link).
There is something used by the military called an identification friend or foe (IFF) device. You can read about it on the Wikipedia page (http://en.wikipedia.org/wiki/Identification_friend_or_foe). Interestingly, it can be defeated using a MITM attack: the challenger sends his challenge towards the adversary, and the adversary relays the challenge to a system friendly to the challenger, then relays the response back. What is interesting is that the IFF device can enforce a reasonable time limit, so that the MITM attack fails due to speed-of-light constraints. In this case, the timing could be considered a kind of “somewhere you are” authentication factor (see 11.8↓).
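Here is a minimal sketch of that timing defense in Python; the maximum range and processing allowance are illustrative assumptions, and send_challenge is a hypothetical callable that transmits the challenge and blocks until the response arrives.

    import time

    C = 299_792_458.0  # speed of light in vacuum, m/s

    def within_range(send_challenge, max_range_m, processing_s=0.0):
        start = time.monotonic()
        send_challenge()              # transmit and await the response
        rtt = time.monotonic() - start
        # The round trip covers at most 2 * max_range_m at light speed:
        return rtt <= (2 * max_range_m / C) + processing_s

A relay through a distant friendly system adds propagation time that no adversary can remove, which is why the speed-of-light bound works.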

10.9.5 Cryptographic Methods

There are cryptographic mechanisms that may be used to detect MITM attacks; see 28.9↓.

10.10 Network Surveillance

10.11 Push vs. Pull Updates

When moving data between systems on a regular basis, I find myself wondering whether it is better to push data or to have the system pull it. In a push model, the pushing system connects to an open port on the destination, which implies that there is the possibility that the destination system could have data pushed to it from another machine. In a pull model, the machine asks for the data it wants, and the sender of the data must have an open port. This is a complex subject. Sometimes push models are inadequate because one of the recipient machines may be unreachable when you are doing the push. Sometimes pull models are inadequate because the pull may come too late for an important security update. Sometimes you need both, where you push to a class of systems and any which are down automagically request the data when they come back up. With SSH, rsync, and proper key management, this is not really a significant security issue, but with other systems implementing their own file distribution protocols, this could be a major security hole. Be careful that any file transfer you establish is a secure one.

10.12 DNS Issues

DNS is perhaps the most widely deployed distributed system, and it can be abused in many ways. The best-known investigator of DNS abuse is Dan Kaminsky; he can tunnel SSH sessions over DNS, store data in DNS as though it were a very fast FTP server, use it to distribute real-time audio data, and snoop on caches to see if you’ve requested a certain DNS name.

10.13 Network Topology

Organizational systems prone to intrusion, or with porous perimeters, should make liberal use of internal firewalls. The same applies to organizational structures: organizations prone to personnel infiltration should use the revolutionary cell structure for their communication topology.
It is possible to triangulate the location of a system using ping times from three locations. Note that it’s not the physical locations of the vantage points that matter, but their network locations; it’s no good if all three share the same long pipe to the target. You need separate paths that converge as close to the target as possible.
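Here is a minimal sketch of the bound a single ping time gives you; triangulation is then a matter of intersecting three such disks. The propagation speed (roughly two-thirds of c in fiber) is an assumption, and since queuing delay only adds time, these are upper bounds on distance.

    C_FIBER = 2.0e8  # m/s, approximate speed of light in fiber

    def max_distance_m(rtt_s):
        return (rtt_s / 2) * C_FIBER   # one-way time times propagation speed

    # Hypothetical RTTs from three vantage points, in seconds:
    for vantage, rtt in [("NYC", 0.012), ("LON", 0.081), ("TYO", 0.160)]:
        print(vantage, round(max_distance_m(rtt) / 1000), "km")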

11 Identification and Authentication

Identification is necessary before making any sort of access control decision. Often it can reduce abuse, because an identified individual knows that misbehavior can bring consequences or sanctions. For example, if an employee abuses the corporate network, they may find themselves on the receiving end of the sysadmin’s luser attitude readjustment tool (LART). I tend to think of authentication as a process you perform on objects (like paintings, antiques, and digitally signed documents), and identification as a process that subjects (people) perform, but in network security you’re really looking at data created by a person for the purpose of identifying them, so I use the terms interchangeably.

11.1 Identity

Sometimes I suspect I’m not who I think I am.
— Ghost in the Shell

An identity, for our purposes, is an abstract concept; it does not map to a person, it maps to a persona. Some people call this a digital ID, but since this paper doesn’t talk about non-digital identities, I’m dropping the qualifier. Identities are different from authenticators, which are something you use to prove your identity. An identifier is shorthand, a handle; like a pointer to the full identity.
To make this concrete, let us take the Unix operating system as an example. Your identity corresponds to a row in the /etc/passwd file. Your identifier is your username, which is used to look up your identity, and your password is an authenticator.
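On a Unix host you can watch this lookup happen with Python’s standard pwd module; the username here is an assumption (the account must exist), and the password hash itself is kept elsewhere on modern systems.

    import pwd

    entry = pwd.getpwnam("alice")   # identifier (username) -> identity (row)
    print(entry.pw_uid, entry.pw_gecos, entry.pw_dir)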

11.2 Identity Management

In relational database design, it is considered a good practice for the primary key (http://en.wikipedia.org/wiki/Primary_key) of a table to be an integer, perhaps a row number, that is not used for anything else. That is because the primary key is used as an identifier for the row; it allows us to modify the object itself, so that the modification occurs in all use cases simultaneously (for a normalized database). Most competent DBAs realize that people change names, phone numbers, locations, and so on; they may even change social security numbers. They also realize that people may share any of these things (even social security numbers are not necessarily unique, especially if they lie about it). So to be able to identify a person across any of these changes, you need to use a row number. The exact same principle applies with security systems; you should always keep the identifiers separate from identities and authenticators.
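Here is a minimal sketch of that principle using Python’s built-in sqlite3; the schema and data are illustrative. The integer id is the identifier, the username is merely an attribute of the identity, and the password hash is the authenticator; either of the latter two can change without disturbing anything keyed to the id.

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("""CREATE TABLE users (
        id INTEGER PRIMARY KEY,   -- identifier: everything references this
        username TEXT UNIQUE,     -- identity attribute: may change
        pw_hash TEXT              -- authenticator: may change
    )""")
    db.execute("INSERT INTO users (username, pw_hash) VALUES (?, ?)",
               ("alice", "placeholder-hash"))
    # A name change; row 1, and everything keyed to it, is unaffected:
    db.execute("UPDATE users SET username = ? WHERE id = 1", ("alicia",))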
Keeping these separate is good because the authenticator (password) may be changed without losing the idea of the identity of the person. However, there are subtle gotchas. In Unix, the username is mapped to a user ID (UID), which is the real way that Unix keeps track of identity. It isn’t necessarily a one-to-one mapping. Also, a poor system administrator may reassign an unused user ID without going through the file system and looking for files owned by the old user, in which case their ownership is silently reassigned to the new user.
PGP/GPG made the mistake of using a cryptographic key as an identifier. If one has to revoke that key, one basically loses everything (such as signatures) which applied to that key, along with the trust that other people have indicated towards it. And if you have multiple keys, friends of yours who have all of them cannot treat them as equivalent, since GPG can’t be told that they are associated with the same identity; the keys are the identity. Instead, they must manage statements about you (such as how much they trust you to act as an introducer) on each key independently.
Some web sites use email addresses as identities, which makes life difficult when the address changes; in some cases, you are effectively a different person once you change email addresses. In my opinion, an identifier like an email address should only serve to look up an identity; it should not be the identity.
For an excellent paper on identity in an Internet context, see:

11.3 The Identity Continuum

Identification can range from fully anonymous, through pseudonymous, to fully identified. Ensuring identity can be expensive, and is never perfect; think about what you are trying to accomplish. This continuum applies to cookies from web sites, email addresses, “real names”, and so on.

11.4 Problems Remaining Anonymous

In cyberspace everyone will be anonymous for 15 minutes.
— Graham Greenleaf

What can we learn from anonymizer, mixmaster, tor, and so on? Often one can de-anonymize supposedly anonymous data. People have de-anonymized search queries this way, and census data, and many more data sets that were supposed to be anonymous.

11.5 Problems with Identifying People

  • Randomly-Chosen Identity
  • Fictitious Identity
  • Stolen Identity

11.6 What Authority?

Does it follow that I reject all authority? Far from me such a thought. In the matter of boots, I refer to the authority of the bootmaker; concerning houses, canals, or railroads, I consult that of the architect or the engineer.

— Mikhail Bakunin, What is Authority? 1882 (http://www.panarchy.org/bakunin/authority.1871.html)

When we are attempting to identify someone, we are relying upon some authority, usually the state government. When you register a domain name with a registrar, they record your personal information in the WHOIS database; this is the system of record (http://en.wikipedia.org/wiki/System_of_record). No matter how careful we are, we can never have a higher level of assurance than this authority has. If the government gave that person a false identity, or the person bribed a DMV clerk to do so, we can do absolutely nothing about it. This is an important implication of the limitations of accuracy (see 4.6↑).

11.7 Goals of Authentication

Authentication serves two related goals: it is designed to let us in while keeping other people out. These goals are two sides of the same coin, but they have different requirements. Letting us in requires that authentication be convenient, while keeping others out requires that it be secure. These goals are often in direct conflict, and are an example of the more general trade-off between convenience and security.

11.8 Authentication Factors

There are many ways you can prove your identity to a system. They may include one or more authentication factors such as:
  • something you are: biometric signatures, such as the pattern of capillaries on your retina, your fingerprints, etc.
  • something you have: a token, physical key, or thumb drive
  • something you know: a passphrase or password
  • somewhere you are: a GPS device in a computer, direction-finding on transmissions, or simply requiring a person to be physically present somewhere to operate the system
  • somewhere you can be reached: a mailing address, network address, email address, or phone number
At the risk of self-promotion, I want to point out that, to my knowledge, the last factor has not been explicitly stated in computer security literature, although it is demonstrated every time a web site emails you your password, or every time a financial company mails something to your home.

11.9 Authenticators

My voice is my passport; verify me.

— Sneakers, the motion picture

The oldest and still most common method for authenticating individuals consists of using passwords. However, there are many problems with using passwords, and I humbly suggest that people start to design systems with the goal of minimizing the use of passwords, passphrases, and other reusable authenticators.

11.9.1 People Pick Lousy Passwords

The first and most important issue is that people pick lousy passwords.
A current plague of security problems stems from rampant password guessing for remote services (specifically, ssh). There have been a number of suggestions for dealing with this, as we shall see.

11.9.2 Picking Secure Passwords

One thing that most people could do to improve their security is to pick better passwords:
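As one illustration, here is a minimal diceware-style generator in Python using the secrets module; the wordlist path is a hypothetical placeholder for any large list of common words.

    import secrets

    with open("wordlist.txt") as f:   # hypothetical wordlist, one word per line
        words = [line.strip() for line in f if line.strip()]

    passphrase = " ".join(secrets.choice(words) for _ in range(6))
    print(passphrase)   # six random common words; long, yet memorable

The point is that the randomness comes from a cryptographic source, not from human choice, which is where lousy passwords come from.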

11.9.3 Preventing Weak Passwords

One invaluable tool for dealing with password guessing involves weeding out weak passwords. No password lockout will help you when your users pick passwords such as “password” and an adversary guesses it on the first try.
There are two ways of doing this. In the older, post facto method, one tries to crack the password hashes. However, since it is desirable to store passwords only after they have been passed through a one-way function, or hash, it is often much more efficient to check them before hashing than to try to crack them afterwards; the catch is that you must locate and guard all the places passwords can be set.
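Here is a minimal sketch of the before-hashing check in Python; the weak-password list and length rule are illustrative, and hashlib.scrypt requires a Python built against OpenSSL with scrypt support.

    import hashlib
    import os

    WEAK = {"password", "123456", "letmein", "qwerty"}  # illustrative list

    def set_password(candidate):
        if candidate.lower() in WEAK or len(candidate) < 10:
            raise ValueError("weak password rejected before it is stored")
        salt = os.urandom(16)
        digest = hashlib.scrypt(candidate.encode(), salt=salt,
                                n=2**14, r=8, p=1)
        return salt + digest    # store the salt alongside the slow hash

Remember the caveat above: this check must run at every place a password can be set, or it is worthless.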

11.9.4 Remembering Passwords

The problem with preventing weak passwords is that if the passwords are hard to guess, they are hard to remember, and users may write them down on post-it notes or simply forget them more often. More sophisticated users may store them in their wallets, or in a password database program like Password Safe:

11.9.5 Password Guessing Lockouts

Most systems employ some sort of abuse detection (lockout) to prevent guessing passwords. In the naive model, this checks for multiple guesses on a single username. For example, the Unix console login has you enter a username, and then prompts for a password; if you get the password wrong three times, it freezes the terminal for a period of time. Guessing multiple passwords for one username is sometimes called the forward hack. Some network login programs like SSH do the same thing, with the sshd_config entry MaxAuthTries determining how many guesses are possible. As a result, some SSH brute-forcing programs try the same password on multiple accounts, the so-called reverse hack.
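Here is a minimal sketch of that naive per-username model in Python; the thresholds are illustrative. Note that it does nothing against the reverse hack, since each username accrues only one failure.

    import time

    MAX_TRIES, WINDOW_S = 3, 300.0   # illustrative thresholds
    _failures = {}                   # username -> recent failure times

    def locked_out(username):
        now = time.monotonic()
        recent = [t for t in _failures.get(username, []) if now - t < WINDOW_S]
        _failures[username] = recent
        return len(recent) >= MAX_TRIES

    def record_failure(username):
        _failures.setdefault(username, []).append(time.monotonic())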
Lockouts also open the door to a denial-of-service attack; the adversary can try various passwords until the account gets locked, denying the legitimate owner access.
One other problem with lockouts is that unless one can centralize all authentication in something like PAM (pluggable authentication modules), an adversary may simply multiplex guesses over different services which all consult the same authentication information. One tool for doing just that is THC’s Hydra:

11.9.6 Limited Password Lifetimes

Some systems require you to change your password frequently, minimizing the amount of time it is good for if it is guessed, ostensibly making it less valuable. The problem with this is that once a password is guessed, the adversary is likely to use it right away, and perhaps set up a back door for later entry into the system. It’s very difficult to detect a well-placed back door. This is also extremely inconvenient to users, and they often end up varying their passwords by some predictable mechanism.
There is another advantage to limited password lifetimes: if passwords take a long time to guess or crack, then rotating them on a shorter time frame means that a cracked password is no longer valuable. This was more relevant when any user could read the hashed passwords from the file /etc/passwd; modern systems keep them in a separate file (on most Unix systems, /etc/shadow) readable only by root, meaning that cracking password hashes can only happen after cracking the root account, and is useful mainly for later access to the same system or to other systems where the users might have the same passwords.

11.9.7 Password Reset Procedure

Enforcing difficult-to-guess passwords and limited password lifetimes increases the chance that users will forget their passwords. This means more users having to reset their passwords, resulting in increased administrative burden and inconvenience to users. In the most secure case, the procedure to reset passwords should be as secure as the mechanism to create accounts; I have worked in places where this required a physical visit and presentation of ID. In most cases, the password reset procedure is as simple as a phone call.

11.9.8 Security Questions

In many cases, this burden is too high or impractical, particularly for web sites. In these situations, the user is often asked to select certain security questions which will allow them to reset their password. The traditional method was to require their mother’s maiden name, but nowadays there is a wide variety of questions, many of which are (unfortunately) easy to guess, especially for people who know the user personally.

11.9.9 Disabling Root Logins

Some security pundits have suggested that you disable logins for root to avoid someone getting in as the administrator; then one must guess the user name of a specific administrator as well, but this really isn’t all that hard, and makes it impossible to, say, rsync an entire file system over ssh (since one cannot log in directly as root, one cannot directly access files as root).
I find it simpler and safer to disallow password-based authentication altogether, wherever possible.
For remote administration, let’s compare the scenario they are suggesting (reusable passphrases but no direct root logins), with my scenario (cryptographic logins, direct root access). My scenario has the following obvious attack vectors:
  • The adversary takes control of the system you’re sitting at, where your ssh key is stored, in which case he could impersonate you anyway (he may have to wait for you to log in to sniff the reusable passphrase, or to hijack an existing connection, but I think it’s not worth worrying about the details; if they have root on your console, you’re hosed).
  • The adversary guesses your 4096-bit private RSA key, possibly without access to the public key. In this case, he could probably use the same technique against the encryption used to protect the SSH or IPsec sessions you’re using to communicate over anyway (host keys are often much smaller than 4096-bit), and in the alternate scenario (no direct root logins, but allowing reusable passphrases) he would get access to the reusable passphrases (and all other communication).
By contrast, their scenario has the same vulnerabilities, plus:
  • Someone guesses the login and password. Login names are not secrets, and never have been treated as secrets (e.g. they’re often in your email address). They may not even be encrypted in the SSH login procedure. Passwords may be something guessable to your adversary but not you; for example, a word in a dictionary you don’t have, an “alternative spelling” that you didn’t think of, or perhaps the user uses the same passphrase to access a web site (perhaps even via unencrypted HTTP).

11.9.10 Eliminating Reusable Authenticators

Thus, it is undesirable to use re-usable authentication over the network. However, these other kinds of authentication present difficulties:
  • Encrypted storage; this is like using encryption to communicate with your future self. Obviously, you must reuse the same key, or somehow re-encrypt the disk. One could, theoretically, disallow direct access to the key used to encrypt the storage, and re-encrypt it each time with a different passphrase, but to protect it from the administrator you’d need to use some sort of hardware decryption device, and to protect it against someone with physical access you’d need tamper-resistant hardware (e.g. TPM).
  • Authenticating to the system you’re sitting at; even then, one could use S/Key or another system for one-time authenticators written down and stored in your wallet, combined with a memorized passphrase.

11.10 Biometrics

Entire books have been written about biometrics, and I am not an expert in the field. Thus, this section is just a stub, waiting for me to flesh it out.

11.11 Authentication Issues: When, What

In Unix, a console login or remote login (via e.g., SSH) requires authentication only once, and then all commands issued in the session are executed without additional authentication. This is the traditional authentication scheme used by most multi-user systems today.
There was historically a system whereby rsh (and later, SSH) could be configured to trust other systems; the system receiving the request trusted the other system to make such a request only on behalf of a duly authorized user, and presumably both systems were in the same administrative domain. However, this turned out to be problematic; the adversary or his software could easily exploit these transitive trust relationships to seize control of multiple, sometimes all, systems in the administrative domain. For this reason, this system-level authentication method is rarely used; however, it is implicitly the model used in network security. A somewhat weaker model is used by firewalls, which use only the IP address (somewhere you claim to be reachable) as the authenticator.
Changing a user’s password is a significant change; it can lock someone out of their account unless and until the person can convince an administrator to reset it. For this reason, the passwd command (and it alone) required entering the old password before you could change it; this prevented someone from sitting down at a logged-in terminal and locking the person out of their own account (and potentially allowing the adversary in from another, safer, location).
As another example, there is a relatively standard way to perform actions as root, the most privileged user, called sudo. The sudo program allows administrators to operate as normal users most of the time, which reduces the risk of accidentally issuing a privileged command; this is a good thing. In this sense, it is similar to role-based access control (see 12.3↓). However, the relevant point here is that sudo started by requiring your account password with each command issued through it. In this way, it prevented accidental issuance of commands by oneself, but also prevented someone else from using an administrator’s console to issue a command. This is authentication of an individual transaction or command. Later this was found to be too inconvenient, so the authentication is now cached for a short period of time, allowing one to issue multiple commands while being prompted for a password only once.
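Here is a minimal sketch of that hybrid in Python: per-command authentication with a short-lived cache, in the spirit of sudo’s timestamp. The five-minute timeout matches sudo’s traditional default; prompt_password and check_password are hypothetical callables.

    import time

    CACHE_TTL_S = 300.0   # sudo's traditional default is five minutes
    _last_auth = None     # time of the last successful authentication

    def authorize(prompt_password, check_password):
        global _last_auth
        now = time.monotonic()
        if _last_auth is not None and now - _last_auth < CACHE_TTL_S:
            return True                 # cached: behaves per-session
        if check_password(prompt_password()):
            _last_auth = now            # fresh: authenticated per-command
            return True
        return False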
Unix has thus evolved a rather hybrid authentication scheme over the years; it authenticates the session only for most things, but in certain cases it authenticates individual commands.
So when designing a system, it seems useful to ask ourselves when we want to authenticate; per session, or per transaction. It is also worth asking what is being authenticated; remote systems, transactions, or people.

11.12 Remote Attestation

Knowing that the remote system is a particular program or piece of hardware is a network security concept called remote attestation. When I connect securely over the network to a machine I believe I have full privileges on, how do I know I’m actually talking to that machine, and not to a similar system controlled by the adversary? Attestation is usually attempted by hiding an encryption key in some tamper-proof part of the system, but it is vulnerable to all kinds of disclosure and side-channel attacks, especially if the owner of the remote system is the adversary.
The most successful example seems to be the satellite television industry, where they embed cryptographic and software secrets in an inexpensive smart card with restricted availability, and change them frequently enough that the resources required to reverse engineer each new card exceed the cost of the data it is protecting. In the satellite TV industry there is something called an ECM (electronic countermeasure), a program update of the form “look at memory location 0xFC, and if it’s not 0xFA, then HCF” (Halt and Catch Fire). The obvious crack is to simply remove that part of the code, but then you trigger another check that looks at the code for the first check, and so on.
The non-cryptographic self-checks they ask the card to perform, such as computing a checksum (e.g., a CRC) over some memory locations, are similar to the protections used against reverse engineering, where a program computes a checksum over itself to detect modification.
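Here is a minimal sketch of such a self-check in Python, using a CRC over a function’s own bytecode; the expected value is a placeholder that would be recorded at build time, and a real scheme would hide and duplicate the check as described above.

    import zlib

    def entitlement_check():
        ...                      # the code we want to be tamper-evident

    EXPECTED_CRC = 0xDEADBEEF    # placeholder recorded at build time

    def self_check():
        return zlib.crc32(entitlement_check.__code__.co_code) == EXPECTED_CRC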

11.13 Advanced Authentication Tools

12 Authorization – Access Control

12.1 Privilege Escalation

Ideally, all services would be impossible to abuse. Since this is difficult or impossible, we often restrict access to them to limit the potential pool of adversaries. Of course, if some users can do some things and others can’t, this creates the opportunity for the adversary to perform an unauthorized action, but that’s often unavoidable. For example, you probably want to be able to do things to your computer, like reformat it and install a new operating system, that you wouldn’t want others to do. You will want your employees to be able to do things an anonymous Internet user cannot (see 4.3↑). Thus, many adversaries want to escalate their privileges to those of some more powerful user, possibly you. Generally, privilege escalation attacks refer to techniques that require some level of access above that of an anonymous remote system, but grant an even higher level of access, bypassing access controls.
Escalations may be horizontal (one user becomes another user) or vertical (a normal user becomes root or Administrator).

12.2 Physical Access Control

These include locks. I like Medeco, but none are perfect. It’s easy to find guides to lock picking:

12.3 Operating System Access Control

12.3.1 Discretionary Access Control

Discretionary Access Control, or DAC (http://en.wikipedia.org/wiki/Discretionary_access_control) is up to the end-user. They can choose to let other people write (or read, e