-----------------------------------------------------------------
We have covered the following hard disk concepts in class:

Hard disks (interfaces: IDE (2+2), SCSI (master + 7 or 15))
    heads/platters
    cylinder groups/tracks/blocks
    sectors
Latency
    RPMs and (average) rotational latency
        5,400-15,000 RPM
        rl = (1/RPM)(60 sec/min)(1/2)
        e.g. at 7,200 RPM: rl = (1/7200)(60)(1/2) sec = about 4.2 ms
    Seek times (track to track, average ...)
        3-12 ms
    Actual latency (user/os/controller/bus/controller/disk and back)
Transfer rates
    burst vs sustained
        non-cache burst max = bytes per track / time per rev
        caching, seeks, rotational latency etc...
    bus vs disk
        bus:  SCSI-2 10 MB/s, SCSI Ultra-160 160 MB/s
        disk: 5 MB/s, 30 MB/s
Controller caching (disk and system)
Asynchronous requests and Streaming
    To modify part of a block, you must read the block, modify it
    and write it back.  This is much slower than writing a
    complete block.
DMA
Noise
Power
Heat
-----------------------------------------------------------------
Low level formats
Master Boot Record (MBR)
    Partition Table
        Start and Stop for each partition
        File system type for each partition
        Boot partition
Partitions
    Boot Block
    File System

     HARD DISK
____________________
| MBR              |    The MBR, the Partition Table,
|                  |    locations of Partitions and
| Partition Table  |    the Boot Block in a partition
|__________________|    are operating system independent.
| First partition  |
|                  |    Each Partition is treated as a
|                  |    separate logical unit by an
|__________________|    operating system.
| Second partition |
|                  |    High level formatting of a partition
|                  |    and the filesystem on a partition are
|__________________|    operating system dependent.  An
        ...             operating system may support more
|__________________|    than one type of filesystem.

Note you can have multiple buses and multiple disk controllers
on a system.

Hardware RAID / Software RAID

What affects actual disk performance?
-----------------------------------------------------------------
Now we will consider a very simplified UNIX filesystem.
Many practical issues are not addressed.

Each Local file system is on a single partition

Block # _________________
   0    |Boot Block     |  Machine code used in system "booting"
        |_______________|
   1    |Super Block    |  Super Block: information about this filesystem
        |_______________|      type of filesystem
   2    |i-node         |      blocking factor and other filesystem
        |_______________|        type-specific information
        |  ...          |      root "/" i-node
   2    |i-node         |      number of i-node blocks
        |_______________|      number of free i-node blocks
   3    |i-node         |      link to free i-node list
        |_______________|      number of data blocks
        |  ...          |      number of free data blocks
        |               |      link to free data block list
        |_______________|  i-node: 1 per file, information
  n+1   |i-node         |      file type
        |_______________|      mode
  n+2   |data block     |      owner uid, group gid
        |_______________|      dates
        |data block     |      link count
        |_______________|      file size
        |  ...          |      ...
        |               |      pointers to data blocks
        |_______________|  Data Block
 n+m+1  |data block     |      bytes in the file (for Regular files)
        |_______________|

Every file has a single inode.  (The "name" of a file is not in the
inode!)

There are actually many copies of the super block, not shown here,
scattered at known locations through the partition.

The "Blocking Factor" and other filesystem parameters are set when the
file system is created.  Values of the Blocking Factor are small
multiples of 4K.  Typically supported values are 4K-16K.  Other
filesystem type-specific information includes frag size, minimum free
space, etc.

File Types include: Regular, Directory, FIFO (Named Pipe), Socket,
Character Special, Block Special, Symbolic Link and sometimes others.
Not all File Types will use data blocks.
-----------------------------------------------------------------
The statvfs(2) call returns a statvfs structure with generic
information about a filesystem.
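As a minimal sketch of that call (assuming a POSIX system; the helper
name print_fs_summary is illustrative and not part of these notes),
the following C function reports a few of the generic fields for the
filesystem that holds a given path:

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers for POSIX declarations */
#include <stdio.h>
#include <sys/statvfs.h>

/*
 * Hypothetical helper (the name is illustrative): print a few generic
 * statvfs fields for the filesystem that holds `path`.
 * Returns 0 on success, -1 if statvfs() fails (errno is set).
 */
int print_fs_summary(const char *path)
{
    struct statvfs vfs;

    if (statvfs(path, &vfs) == -1) {
        perror("statvfs");
        return -1;
    }

    printf("preferred block size   : %lu\n", (unsigned long)vfs.f_bsize);
    printf("fundamental block size : %lu\n", (unsigned long)vfs.f_frsize);

    /* The three block counts are expressed in units of f_frsize. */
    printf("blocks total/free/avail: %llu/%llu/%llu\n",
           (unsigned long long)vfs.f_blocks,
           (unsigned long long)vfs.f_bfree,
           (unsigned long long)vfs.f_bavail);

    printf("inodes total/free      : %llu/%llu\n",
           (unsigned long long)vfs.f_files,
           (unsigned long long)vfs.f_ffree);

    printf("max file name length   : %lu\n", (unsigned long)vfs.f_namemax);
    printf("read-only              : %s\n",
           (vfs.f_flag & ST_RDONLY) ? "yes" : "no");
    return 0;
}
```

Calling print_fs_summary("/") from main() would report on the root
filesystem.  The sketch sticks to members that POSIX specifies; some
members of a particular system's structure may not be portable.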
The statvfs structure includes the following members:

    u_long      f_bsize;             /* preferred file system block size */
    u_long      f_frsize;            /* fundamental filesystem block size */
    fsblkcnt_t  f_blocks;            /* total # of blocks, in units of f_frsize */
    fsblkcnt_t  f_bfree;             /* total # of free blocks */
    fsblkcnt_t  f_bavail;            /* # of free blocks avail to non-super-user */
    fsfilcnt_t  f_files;             /* total # of file nodes (inodes) */
    fsfilcnt_t  f_ffree;             /* total # of free file nodes */
    fsfilcnt_t  f_favail;            /* # of inodes avail to non-super-user */
    u_long      f_fsid;              /* file system id (dev for now) */
    char        f_basetype[FSTYPSZ]; /* target fs type name, null-terminated */
    u_long      f_flag;              /* bit mask of flags */
    u_long      f_namemax;           /* maximum file name length */
    char        f_fstr[32];          /* file system specific string */
    u_long      f_filler[16];        /* reserved for future expansion */

The following values can be returned in the f_flag field:

    ST_RDONLY   0x01   /* read-only file system */
    ST_NOSUID   0x02   /* does not support setuid/setgid semantics */
    ST_NOTRUNC  0x04   /* does not truncate file names longer than NAME_MAX */
----------------------------------------------------------------
Kernel Data Structures for Open Files

Abstractly, the Kernel has a Process Control Block (PCB) for each user
process, which contains all the information the kernel maintains about
the process.  One part of a PCB is the "file descriptor table".

To discuss the file input/output (IO) operations, two other kernel data
structures need to be mentioned: the File Table and the V-node Table.
These two tables are not part of the PCB.  There is one PCB for each
process, but only one File Table and one V-node Table in the kernel,
"shared" by all processes.

       PCB
_________________
|PID            |
|---------------|                                     V-node Table
|PPID           |                                  ___________________
|---------------|                              --->|v-node info      |
|UID            |                              |   |i-node info      |
|---------------|                              |   |current file size|
|EUID           |           File Table         |   |_________________|
|---------------|      _____________________   |
|...            |  --->|file status flags  |   |
|               |  |   |current file offset|   |
|---------------|  |   |v-node ptr --------|----
|file descriptor|  |   |___________________|
|table          |  |
|fd flags    ptr|  |
|-- -----    ---|  |                               ___________________
| 0            -|  |                           --->|v-node info      |
| 1            -|---                           |   |i-node info      |
| 2            -|      _____________________   |   |current file size|
| 3            -|----->|file status flags  |   |   |_________________|
| ...          -|      |current file offset|   |
| fd_max       -|      |v-node ptr --------|----
|_______________|      |___________________|
|               |
|_______________|

fd (file descriptor) flags (only one defined)
    FD_CLOEXEC - on an exec(2) call, close this descriptor automatically

file status flags
    The file status flags hold the bits set in the second argument to
    the open(2) call, if the open() was successful, i.e.
        (O_RDONLY, O_WRONLY, O_RDWR)
        O_APPEND
        O_NONBLOCK
        O_SYNC
        O_ASYNC (4.3+BSD only)

current file offset
    Where to read or write the next byte, for a read(2) or write(2)

v-node info  (Filesystem type-independent "i-node information")
    File System independent part of file information.
        File Type (types above plus: NFS regular, NFS directory, ...)
        Permissions, SUID/SGID/Sticky, Owner, Group, Dates
        Link Count

i-node info  (Filesystem type-dependent "i-node information")
    File System (File Type) dependent part of file system information
    Type:
        Regular
            local interface id (via a device id)
            local disk id (via a device id)
            local partition # (via a device id)
            local inode #
            pointers to local data blocks
        NFS Regular
            remote server id
            remote tree id
            remote file id
        Directory
            ...

current file size
    Current number of bytes in the file.
----------------------------------------------------------------
The stat(2) call (fstat(2) for an open descriptor) returns generic
information about the file in a stat structure.
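A minimal sketch of querying an open descriptor, assuming a POSIX
system and using fstat(2), the descriptor-based variant of stat(2).
The helper names file_kind and describe_open_file are illustrative
and not part of these notes:

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers for POSIX declarations */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map the file-type bits of a mode to a name. */
const char *file_kind(mode_t mode)
{
    if (S_ISREG(mode))  return "regular";
    if (S_ISDIR(mode))  return "directory";
    if (S_ISFIFO(mode)) return "fifo";
    if (S_ISCHR(mode))  return "character special";
    if (S_ISBLK(mode))  return "block special";
    if (S_ISSOCK(mode)) return "socket";
    return "other";
}

/*
 * Hypothetical helper: open `path` read-only, fstat() the descriptor,
 * and print a few members of the resulting stat structure.
 * Returns 0 on success, -1 on error.
 */
int describe_open_file(const char *path)
{
    struct stat sb;
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        perror("open");
        return -1;
    }
    if (fstat(fd, &sb) == -1) {
        perror("fstat");
        close(fd);
        return -1;
    }

    printf("type  : %s\n", file_kind(sb.st_mode));
    printf("inode : %llu\n", (unsigned long long)sb.st_ino);
    printf("links : %lu\n", (unsigned long)sb.st_nlink);
    printf("size  : %lld bytes\n", (long long)sb.st_size);

    close(fd);
    return 0;
}
```

Note that nothing printed here is a file name: the information comes
from the file's i-node, reached through the descriptor.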
The stat structure includes the following members:

    mode_t    st_mode;     /* File type and mode */
    ino_t     st_ino;      /* Inode number */
    dev_t     st_dev;      /* ID of device with a dir entry for this file */
    dev_t     st_rdev;     /* ID of device (for char and block special) */
    nlink_t   st_nlink;    /* Number of links */
    uid_t     st_uid;      /* User ID of the file's owner */
    gid_t     st_gid;      /* Group ID of the file's group */
    off_t     st_size;     /* File size in bytes */
    time_t    st_atime;    /* Time of last access */
    time_t    st_mtime;    /* Time of last data modification */
    time_t    st_ctime;    /* Time of last file status change */
    long      st_blksize;  /* Preferred I/O block size */
    blkcnt_t  st_blocks;   /* Number of 512 byte blocks allocated */

The "ls -l" command makes a stat call for each file in the current
directory, looks up strings for st_uid etc., and then formats and
prints the results, one line per directory entry.
----------------------------------------------------------------
****************************************************************
Now consider what is done with each system call:

    open(2)  read(2)  write(2)  lseek(2)  close(2)  unlink(2)
    chmod(2)  chown(2)  stat(2)  statvfs(2)  dup(2)  dup2(2)
    mention pipe(2)

Having processes "share" descriptors.
Having two processes open the same file.
Atomic operations on files
****************************************************************
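The difference between sharing a descriptor and opening the same file
twice comes down to where the current file offset lives.  The sketch
below is a single-process illustration, assuming a POSIX system; the
helper name offset_seen_by_second_fd is hypothetical.  It reads
through one descriptor and then asks a second descriptor for its
offset.  A descriptor made with dup(2) refers to the same File Table
entry, so it sees the moved offset.  A second open(2) gets its own
File Table entry, with its own offset, even though both entries lead
to the same v-node.

```c
#define _POSIX_C_SOURCE 200809L   /* ask the headers for POSIX declarations */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read `nbytes` through a first descriptor, then
 * return the current offset as seen through a second descriptor.
 *   use_dup != 0 : the second descriptor comes from dup(fd1)
 *   use_dup == 0 : the second descriptor comes from a second open()
 * Returns -1 on any error (including a short read).
 */
off_t offset_seen_by_second_fd(const char *path, size_t nbytes, int use_dup)
{
    char buf[64];
    off_t pos = -1;
    int fd1, fd2;

    if (nbytes > sizeof buf)
        return -1;

    fd1 = open(path, O_RDONLY);
    if (fd1 == -1)
        return -1;

    fd2 = use_dup ? dup(fd1) : open(path, O_RDONLY);
    if (fd2 != -1) {
        /* Advance the offset in the File Table entry behind fd1. */
        if (read(fd1, buf, nbytes) == (ssize_t)nbytes) {
            /* lseek(fd, 0, SEEK_CUR) reports the offset without moving it. */
            pos = lseek(fd2, 0, SEEK_CUR);
        }
        close(fd2);
    }
    close(fd1);
    return pos;
}
```

For a file of at least 4 bytes, offset_seen_by_second_fd(path, 4, 1)
returns 4 (shared offset), while offset_seen_by_second_fd(path, 4, 0)
returns 0 (independent offset).  Descriptors inherited across fork(2)
behave like the dup(2) case; two unrelated processes that each call
open(2) on the file behave like the second case.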