Unix File Types

In Unix systems, there are 6 file types (7 if you count character and block device files separately). Below I will give a very short description of each.

How to find out the type of file in Unix

The first and most obvious way to confirm the type of a particular file is to use the long-format output of the ls command, invoked by the -l option:

$ ls -l *
-rw-r--r-- 1 greys greys 1024 Mar 29 06:31 text

The very first field of such output is the file type and access permissions field. I’ll cover permissions in a separate post in the future. For now, just concentrate on the first character in this field. In this particular case, it’s “-”, which means it’s a regular file. For other file types, this character will be different.
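If you only want that first character, a small shell helper can pull it out of the ls output for you. This is just a minimal sketch: type_char is a name I made up for illustration, and the temporary directory and file are created only for the demo.

```shell
# type_char is a hypothetical helper, not a standard command.
type_char() {
    # -d stops ls from listing a directory's contents;
    # cut -c1 keeps only the first character of the mode field.
    ls -ld -- "$1" | cut -c1
}

workdir=$(mktemp -d)            # scratch directory for the demo
: > "$workdir/text"             # create an empty regular file
regular_type=$(type_char "$workdir/text")
echo "$regular_type"            # prints "-"
rm -r "$workdir"
```

The same helper should report the other characters described in the rest of this post when pointed at files of those types.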

Regular file

This is the most common type of file in Unix. It is a plain collection of bytes holding arbitrary data. There’s nothing mysterious about this type. Most of the files you will ever work with are regular.

In long-format output of ls, this type of file is specified by the “-” symbol.

Directory

This is a special type of file in Unix that contains only a list of other files (the contents of the directory). You don’t work with directories directly. Instead, you manage them with standard commands provided with your OS. The whole directory structure of your Unix system is made of such special files, each holding the contents of one directory.

In long-format output of ls, this type of file is specified by the “d” symbol:

$ ls -ld *
drwxr-xr-x 2 greys greys 4096 Aug 21 11:00 mydir
-rw-r--r-- 1 greys greys 1024 Mar 29 06:31 text
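To see the “manage them with standard commands” part in action, here’s a small sketch that creates a directory, puts two files in it and then inspects it. The paths and variable names are made up for the example, and everything happens inside a temporary directory.

```shell
workdir=$(mktemp -d)                # scratch area for the demo
mkdir "$workdir/mydir"              # mkdir creates the directory file for us
: > "$workdir/mydir/a"              # two empty regular files inside it
: > "$workdir/mydir/b"

# test -d (written here as [ -d ... ]) succeeds only for directories
if [ -d "$workdir/mydir" ]; then
    dir_check="directory"
else
    dir_check="not a directory"
fi
echo "$dir_check"                   # prints "directory"

# The first character of the long listing is "d"
dir_type=$(ls -ld "$workdir/mydir" | cut -c1)

# Listing the directory shows the names it holds; count them
entry_count=$(ls "$workdir/mydir" | wc -l | tr -d ' ')
echo "$dir_type $entry_count"       # prints "d 2"
rm -r "$workdir"
```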

Special Device File

This type of file in Unix allows access to various devices known to your system. Almost every device has a special file associated with it. This simplifies the way Unix interacts with different devices: to the OS and most commands each device is still a file, so it can be read from and written to using various commands. Most special device files are owned by root, and regular users cannot create them.

Depending on the way of accessing each device, its special device file can be either a character (shown as “c” in ls output) or a block (shown as “b”) device. One device can have more than one device file associated, and it’s perfectly normal to have both character and block device files for the same device.

Most special device files are character ones, and the devices they refer to are called raw devices. The simple reason behind the name is that by accessing a device via its character device file, you’re accessing the raw data on the device in the form the device is ready to operate with. For terminal devices, it’s one character at a time. For disk devices though, raw access means reading or writing in whole chunks of data (blocks) that are native to your disk. The most important thing to remember about raw devices is that all the read/write operations to them are direct, immediate and not cached.

A block device file provides similar access to the same device, only this time the interaction is buffered by the kernel of your Unix OS. Grouping data into logical blocks and caching such blocks in memory allows the kernel to process most I/O requests much more efficiently. It no longer has to physically access the disk every time a request happens. The data block is read once, and then all the operations on it happen in the cached copy, with data being synced to the actual device at regular intervals by a special process running in your OS.

Here’s how the different types of special device files look in your ls output:

$ ls -al /dev/loop0 /dev/ttys0
brw-rw---- 1 root disk 7,  0 Sep  7 05:03 /dev/loop0
crw-rw-rw- 1 root tty  3, 48 Sep  7 05:04 /dev/ttys0
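You can also ask the shell directly whether a path is a block or a character device, using the -b and -c checks of the test command. Here’s a minimal sketch: device_kind is a hypothetical helper name, and I’m assuming /dev/null exists and is a character device, which is the case on the Unix systems I know of.

```shell
# device_kind is a hypothetical helper used only for this sketch.
device_kind() {
    if [ -b "$1" ]; then
        echo "block"
    elif [ -c "$1" ]; then
        echo "character"
    else
        echo "not a device"
    fi
}

null_kind=$(device_kind /dev/null)
echo "/dev/null: $null_kind"        # prints "/dev/null: character"
```

Try it against the disk device files on your own system to see the block case. Their names vary between Unix flavours, so I’m not hardcoding one here.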

Named Pipe

Named pipes represent one of the simpler forms of Unix interprocess communication. Their purpose is to connect the I/O of two Unix processes accessing the pipe. One of the processes uses the pipe for output of data, while another process uses the very same named pipe file for input.

In long-format output of ls, named pipes are marked by the “p” symbol:

$ ls -al /dev/xconsole
prw-r----- 1 root adm 0 Sep 25 08:58 /dev/xconsole
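Here’s a small sketch of two processes talking through a named pipe. The pipe path is made up for the example and lives in a temporary directory; mkfifo is the standard command for creating named pipes.

```shell
workdir=$(mktemp -d)
fifo="$workdir/demo.fifo"           # hypothetical path for this sketch
mkfifo "$fifo"                      # create the named pipe

pipe_type=$(ls -ld "$fifo" | cut -c1)
echo "$pipe_type"                   # prints "p"

# Writer: runs in the background, because opening a named pipe
# for writing blocks until some process opens it for reading.
echo "hello through the pipe" > "$fifo" &

# Reader: opens the very same file for input and gets the data.
received=$(cat "$fifo")
wait                                # let the background writer finish
echo "$received"                    # prints "hello through the pipe"
rm -r "$workdir"
```

Note that the data never lands on disk: the pipe file is just a meeting point for the two processes, which is why ls reports its size as 0.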

Symbolic Link

This is yet another file type in Unix, used for referencing some other file on the filesystem. A symbolic link contains a text form of the path to the file it references. To an end user, a symlink (short for symbolic link) will appear to have its own name, but when you try reading or writing data to this file, the operations will instead be redirected to the file it points to.

In long-format output of ls, symlinks are marked by the “l” symbol (that’s a lower case L). The output also shows the path to the referenced file:

$ ls -al hosts
lrwxrwxrwx 1 greys www-data 10 Sep 25 09:06 hosts -> /etc/hosts

In this example, a symlink called hosts points to the /etc/hosts file.
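To watch a symlink redirect reads to its target, here’s a minimal sketch. The file names are made up and live in a temporary directory. I’m also assuming the readlink command is available, which it is on Linux, the BSDs and macOS, but it’s worth checking on older systems.

```shell
workdir=$(mktemp -d)
echo "127.0.0.1 localhost" > "$workdir/target"   # the real file
ln -s "$workdir/target" "$workdir/link"          # symlink pointing at it

link_type=$(ls -ld "$workdir/link" | cut -c1)
echo "$link_type"                   # prints "l"

# readlink prints the path stored inside the symlink itself
stored_path=$(readlink "$workdir/link")

# Reading through the link returns the target's contents
via_link=$(cat "$workdir/link")
echo "$via_link"                    # prints "127.0.0.1 localhost"
rm -r "$workdir"
```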

Socket

A Unix socket (sometimes also called an IPC socket, short for inter-process communication socket) is a special file which allows for advanced inter-process communication. In essence, it is a stream of data, very similar to a network stream (and network sockets), but all the transactions stay local to the machine and the socket is addressed by a path in the filesystem.

In long-format output of ls, Unix sockets are marked by the “s” symbol:

$ ls -al /dev/log
srw-rw-rw- 1 root root 0 Sep  7 05:04 /dev/log
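Sockets can be detected with the -S check of the test command, and the other file types have similar checks. Putting them together, here’s a rough sketch of a classifier for all the file types from this post. file_type is a hypothetical name of mine, and the demo only exercises the types that are easy to create in a temporary directory (no socket or block device).

```shell
# file_type is a hypothetical helper for this sketch.
file_type() {
    # -L must come first: the other checks follow symlinks
    # and would report the type of the link's target instead.
    if   [ -L "$1" ]; then echo "symbolic link"
    elif [ -f "$1" ]; then echo "regular file"
    elif [ -d "$1" ]; then echo "directory"
    elif [ -b "$1" ]; then echo "block device"
    elif [ -c "$1" ]; then echo "character device"
    elif [ -p "$1" ]; then echo "named pipe"
    elif [ -S "$1" ]; then echo "socket"
    else echo "unknown or missing"
    fi
}

workdir=$(mktemp -d)
: > "$workdir/text"
mkfifo "$workdir/pipe"
ln -s "$workdir/text" "$workdir/link"

t_file=$(file_type "$workdir/text")
t_dir=$(file_type "$workdir")
t_pipe=$(file_type "$workdir/pipe")
t_link=$(file_type "$workdir/link")
t_missing=$(file_type "$workdir/nothing-here")
echo "$t_file, $t_dir, $t_pipe, $t_link, $t_missing"
rm -r "$workdir"
```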

That’s it. Hope this gave you a better idea of what file types you can find working on your Unix system. I’ll obviously expand relevant topics in the future. Let me know if there’s anything in particular you’d like me to concentrate on!