A hard link to a file is indistinguishable from the original file because it is a reference to the object underlying the original filename. (To be precise: each of the hard links to a file is a reference to the same inode number, where an inode number is an index into the inode table, which contains metadata about all files on a filesystem. See stat(2).) Changes to a file are independent of the name used to reference the file. Hard links may not refer to directories (to prevent the possibility of loops within the filesystem tree, which would confuse many programs) and may not refer to files on different filesystems (because inode numbers are not unique across filesystems).
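The shared-inode behavior described above can be observed directly. The following is a minimal sketch using Python's os module, whose os.link() and os.stat() wrap link(2) and stat(2); the filenames afile and hardlink are hypothetical and live in a temporary directory.

```python
import os
import tempfile


def hard_link_demo():
    """Create a file plus a hard link and report what the two names share."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "afile")     # hypothetical name
        alias = os.path.join(tmp, "hardlink")     # hypothetical name
        with open(original, "w") as f:
            f.write("first\n")
        os.link(original, alias)                  # second name, same inode

        # Append through one name, then read through the other.
        with open(alias, "a") as f:
            f.write("second\n")
        with open(original) as f:
            content = f.read()

        st_a, st_b = os.stat(original), os.stat(alias)
        return {
            "same_inode": (st_a.st_dev, st_a.st_ino) == (st_b.st_dev, st_b.st_ino),
            "link_count": st_a.st_nlink,
            "content_via_original": content,
        }


if __name__ == "__main__":
    print(hard_link_demo())
```

Both names report the same device and inode pair, and the link count reflects the two references.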
A symbolic link is a special type of file whose contents are a string that is the pathname of another file, the file to which the link refers. (The contents of a symbolic link can be read using readlink(2).) In other words, a symbolic link is a pointer to another name, and not to an underlying object. For this reason, symbolic links may refer to directories and may cross filesystem boundaries.
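As an illustration, the sketch below creates a symbolic link to a directory and reads the link's contents back with os.readlink(), which wraps readlink(2). The names adir, afile, and slink are hypothetical.

```python
import os
import tempfile


def symlink_demo():
    """Return the stored target of a symlink and what following it yields."""
    with tempfile.TemporaryDirectory() as tmp:
        target_dir = os.path.join(tmp, "adir")    # hypothetical name
        os.mkdir(target_dir)
        with open(os.path.join(target_dir, "afile"), "w") as f:
            f.write("hello\n")

        link = os.path.join(tmp, "slink")         # hypothetical name
        os.symlink("adir", link)                  # the string "adir" is stored as-is

        stored = os.readlink(link)                # contents of the link itself
        is_link = os.path.islink(link)
        entries = os.listdir(link)                # follows the link into adir
        return stored, is_link, entries


if __name__ == "__main__":
    print(symlink_demo())
```

The link stores only the pathname string; listing the directory through the link works because the pathname is resolved at the time of use.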
There is no requirement that the pathname referred to by a symbolic link should exist. A symbolic link that refers to a pathname that does not exist is said to be a dangling link.
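A dangling link can be created and inspected as in the following sketch. It relies on os.path.lexists(), which does not follow the final symbolic link, and os.path.exists(), which does; the names dangling and no-such-file are hypothetical.

```python
import errno
import os
import tempfile


def dangling_demo():
    """Create a symlink to a nonexistent pathname and probe it."""
    with tempfile.TemporaryDirectory() as tmp:
        link = os.path.join(tmp, "dangling")      # hypothetical name
        os.symlink("no-such-file", link)          # target need not exist

        result = {
            "lexists": os.path.lexists(link),     # the link itself is present
            "exists": os.path.exists(link),       # the referenced file is not
            "target": os.readlink(link),
        }
        try:
            open(link).close()
            result["open_errno"] = None
        except OSError as exc:
            result["open_errno"] = exc.errno      # opening follows the link
        return result


if __name__ == "__main__":
    print(dangling_demo())
```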
Because a symbolic link and its referenced object coexist in the filesystem name space, confusion can arise in distinguishing between the link itself and the referenced object. On historical systems, commands and system calls adopted their own link-following conventions in a somewhat ad hoc fashion. Rules for a more uniform approach, as they are implemented on Linux and other systems, are outlined here. It is important that site-local applications also conform to these rules, so that the user interface can be as consistent as possible.
On Linux, the permissions of a symbolic link are not used in any operations; the permissions are always 0777 (read, write, and execute for all user categories), and can't be changed. (Note that there are some "magic" symbolic links in the /proc directory tree, for example the /proc/[pid]/fd/* files, that have different permissions.)
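The fixed mode of an ordinary symbolic link can be checked with os.lstat(). The sketch below is a minimal illustration; the 0777 result is asserted only for Linux, since other systems may report a different mode, and the names afile and slink are hypothetical.

```python
import os
import stat
import sys
import tempfile

ON_LINUX = sys.platform.startswith("linux")


def symlink_mode():
    """Return (permission bits of the link, permission bits of its target)."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "afile")       # hypothetical name
        with open(target, "w"):
            pass
        os.chmod(target, 0o600)                   # restrictive mode on the target

        link = os.path.join(tmp, "slink")         # hypothetical name
        os.symlink(target, link)

        link_mode = stat.S_IMODE(os.lstat(link).st_mode)    # the link itself
        target_mode = stat.S_IMODE(os.stat(link).st_mode)   # through the link
        return link_mode, target_mode


if __name__ == "__main__":
    link_mode, target_mode = symlink_mode()
    print(oct(link_mode), oct(target_mode))
```

Access through the link is governed by the target's mode, not by the mode reported for the link.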
By default (i.e., if the AT_SYMLINK_FOLLOW flag is not specified), if name_to_handle_at(2) is applied to a symbolic link, it yields a handle for the symbolic link (rather than the file to which it refers). One can then obtain a file descriptor for the symbolic link (rather than the file to which it refers) by specifying the O_PATH flag in a subsequent call to open_by_handle_at(2). Again, that file descriptor can be used in the aforementioned system calls to operate on the symbolic link itself.
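There are three separate areas that need to be discussed: symbolic links used as filename arguments for system calls; symbolic links specified as command-line arguments to utilities that are not traversing a file tree; and symbolic links encountered by utilities that are traversing a file tree.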
Except as noted below, all system calls follow symbolic links. For example, if there were a symbolic link slink which pointed to a file named afile, the system call open(slink ...) would return a file descriptor referring to the file afile.
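The open() example above can be reproduced with os.open(), which wraps open(2). This sketch compares the file descriptor's metadata with that of afile to confirm that the descriptor refers to the target rather than to the link; slink and afile are created in a temporary directory.

```python
import os
import stat
import tempfile


def open_follows_symlink():
    """Open a symlink and report which object the descriptor refers to."""
    with tempfile.TemporaryDirectory() as tmp:
        afile = os.path.join(tmp, "afile")
        slink = os.path.join(tmp, "slink")
        with open(afile, "w") as f:
            f.write("data\n")
        os.symlink(afile, slink)

        fd = os.open(slink, os.O_RDONLY)          # the link is followed
        try:
            fd_stat = os.fstat(fd)
            data = os.read(fd, 100)
        finally:
            os.close(fd)

        same_object = os.path.samestat(fd_stat, os.stat(afile))
        return same_object, stat.S_ISREG(fd_stat.st_mode), data


if __name__ == "__main__":
    print(open_follows_symlink())
```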
Various system calls do not follow links, and operate on the symbolic link itself. They are: lchown(2), lgetxattr(2), llistxattr(2), lremovexattr(2), lsetxattr(2), lstat(2), readlink(2), rename(2), rmdir(2), and unlink(2).
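Two of the calls listed above, lstat(2) and unlink(2), are shown in the following sketch through their Python wrappers os.lstat() and os.unlink(). It contrasts them with os.stat(), which follows the link; the names afile and slink are hypothetical.

```python
import os
import stat
import tempfile


def nofollow_calls():
    """Show that lstat and unlink act on the link, not on its target."""
    with tempfile.TemporaryDirectory() as tmp:
        afile = os.path.join(tmp, "afile")        # hypothetical name
        slink = os.path.join(tmp, "slink")        # hypothetical name
        with open(afile, "w") as f:
            f.write("data\n")
        os.symlink(afile, slink)

        lstat_sees_link = stat.S_ISLNK(os.lstat(slink).st_mode)
        stat_sees_link = stat.S_ISLNK(os.stat(slink).st_mode)

        os.unlink(slink)                          # removes the link only
        link_remains = os.path.lexists(slink)
        target_remains = os.path.exists(afile)
        return lstat_sees_link, stat_sees_link, link_remains, target_remains


if __name__ == "__main__":
    print(nofollow_calls())
```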
Certain other system calls optionally follow symbolic links. They are: faccessat(2), fchownat(2), fstatat(2), linkat(2), name_to_handle_at(2), open(2), openat(2), open_by_handle_at(2), and utimensat(2); see their manual pages for details. Because remove(3) calls unlink(2) for anything that is not a directory (and rmdir(2) for directories), that library function also does not follow symbolic links. When rmdir(2) is applied to a symbolic link, it fails with the error ENOTDIR.
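The optional-follow behavior and the rmdir(2) error can both be seen in the sketch below. Python exposes the choice as a follow_symlinks keyword on os.stat(); how that keyword maps onto fstatat(2) or lstat(2) internally is not assumed here. The names adir and slink are hypothetical.

```python
import errno
import os
import stat
import tempfile


def rmdir_on_symlink():
    """stat a directory symlink both ways, then try rmdir on the link."""
    with tempfile.TemporaryDirectory() as tmp:
        adir = os.path.join(tmp, "adir")          # hypothetical name
        slink = os.path.join(tmp, "slink")        # hypothetical name
        os.mkdir(adir)
        os.symlink(adir, slink)

        nofollow = os.stat(slink, follow_symlinks=False)   # the link itself
        follow = os.stat(slink)                            # the directory

        try:
            os.rmdir(slink)
            rmdir_errno = None
        except OSError as exc:
            rmdir_errno = exc.errno               # the link is not a directory

        return (
            stat.S_ISLNK(nofollow.st_mode),
            stat.S_ISDIR(follow.st_mode),
            rmdir_errno,
            os.path.isdir(adir),                  # target directory untouched
        )


if __name__ == "__main__":
    print(rmdir_on_symlink())
```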
link(2) warrants special discussion. POSIX.1-2001 specifies that link(2) should dereference oldpath if it is a symbolic link. However, Linux does not do this. (By default, Solaris is the same, but the POSIX.1-2001 specified behavior can be obtained with suitable compiler options.) POSIX.1-2008 changed the specification to allow either behavior in an implementation.
Except as noted below, commands follow symbolic links named as command-line arguments. For example, if there were a symbolic link slink which pointed to a file named afile, the command cat slink would display the contents of the file afile.
It is important to realize that this rule includes commands which may optionally traverse file trees; for example, the command chown file is included in this rule, while the command chown -R file, which performs a tree traversal, is not. (The latter is described in the third area, below.)
If the command is meant to operate on the symbolic link itself rather than follow it (that is, chown should change the ownership of the file that slink is, whether or not it is a symbolic link), the -h option should be used. For example, chown root slink would change the ownership of the file referred to by slink, while chown -h root slink would change the ownership of slink itself.
There are some exceptions to this rule:
It is important to realize that the following rules apply equally to symbolic links encountered during the file tree traversal and symbolic links listed as command-line arguments.
The first rule applies to symbolic links that reference files other than directories. Operations that apply to symbolic links are performed on the links themselves, but otherwise the links are ignored.
The command rm -r slink directory will remove slink, as well as any symbolic links encountered in the tree traversal of directory, because symbolic links may be removed. In no case will rm(1) affect the file referred to by slink.
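A comparable behavior can be observed with Python's shutil.rmtree(), used here only as a stand-in for rm -r and not as a description of how rm(1) is implemented. In this sketch, the tree contains symbolic links to a file and to a directory that live outside it; all names are hypothetical.

```python
import os
import shutil
import tempfile


def remove_tree_with_symlinks():
    """Remove a tree containing symlinks; report what survives outside it."""
    with tempfile.TemporaryDirectory() as tmp:
        outside_file = os.path.join(tmp, "outside.txt")
        outside_dir = os.path.join(tmp, "outside_dir")
        with open(outside_file, "w") as f:
            f.write("keep me\n")
        os.mkdir(outside_dir)
        with open(os.path.join(outside_dir, "inner.txt"), "w") as f:
            f.write("keep me too\n")

        directory = os.path.join(tmp, "directory")
        os.mkdir(directory)
        os.symlink(outside_file, os.path.join(directory, "file_link"))
        os.symlink(outside_dir, os.path.join(directory, "dir_link"))

        shutil.rmtree(directory)                  # removes the links themselves

        return (
            os.path.exists(directory),
            os.path.exists(outside_file),
            os.path.exists(os.path.join(outside_dir, "inner.txt")),
        )


if __name__ == "__main__":
    print(remove_tree_with_symlinks())
```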
The second rule applies to symbolic links that refer to directories. Symbolic links that refer to directories are never followed by default. This is often referred to as a "physical" walk, as opposed to a "logical" walk (where symbolic links that refer to directories are followed).
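The difference between a physical and a logical walk can be sketched with os.walk(), whose followlinks parameter defaults to False. This is an analogy to the command behavior described above, not a model of any particular utility; the directory layout and names are hypothetical.

```python
import os
import tempfile


def walk_files(top, follow):
    """Return sorted paths (relative to top) of regular entries found by a walk."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(top, followlinks=follow):
        for name in filenames:
            found.append(os.path.relpath(os.path.join(dirpath, name), top))
    return sorted(found)


def walk_demo():
    """Walk a tree containing a directory symlink, physically and logically."""
    with tempfile.TemporaryDirectory() as tmp:
        real = os.path.join(tmp, "real")          # directory outside the tree
        os.mkdir(real)
        with open(os.path.join(real, "inner.txt"), "w"):
            pass

        top = os.path.join(tmp, "top")
        os.mkdir(top)
        with open(os.path.join(top, "a.txt"), "w"):
            pass
        os.symlink(real, os.path.join(top, "dirlink"))

        physical = walk_files(top, follow=False)  # dirlink is not descended
        logical = walk_files(top, follow=True)    # dirlink is descended
        return physical, logical


if __name__ == "__main__":
    print(walk_demo())
```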
Certain conventions should be followed as consistently as possible by commands that perform file tree walks:
For commands that do not by default do file tree traversals, the -H, -L, and -P flags are ignored if the -R flag is not also specified. In addition, you may specify the -H, -L, and -P options more than once; the last one specified determines the command's behavior. This is intended to permit you to alias commands to behave one way or the other, and then override that behavior on the command line.
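The "last one wins" rule and the dependence on -R can be expressed as a small option-resolution routine. The function below is a hypothetical sketch, not taken from any real utility; it takes the single-letter options in command-line order and assumes a physical walk ("P") as the default.

```python
def resolve_walk_mode(options, default="P"):
    """Return the effective walk mode ("H", "L", or "P"), or None without -R.

    `options` is a sequence of single-letter flags in the order given on
    the command line, for example ["R", "L", "P"].
    """
    if "R" not in options:
        return None                   # -H, -L, and -P are ignored without -R
    mode = default
    for opt in options:
        if opt in ("H", "L", "P"):
            mode = opt                # a later flag overrides an earlier one
    return mode


if __name__ == "__main__":
    # An alias might supply -R -L; the user then appends -P to override it.
    print(resolve_walk_mode(["R", "L", "P"]))
```

Letting the last flag win is what makes it practical to set a preferred mode in an alias and still override it for a single invocation.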