HiDIOS filesystem

Welcome to the HiDIOS home page. The HiDIOS filesystem is a product of the ACSys PIOUS project.

Contents

This is a paper from PCW'95.

For more information, please email hidios@cafe.anu.edu.au.

Overview of the HiDIOS File System

HiDIOS is a parallel filesystem for the AP1000 multicomputer. The major design criteria for the filesystem have been high speed, robustness, and usability (including porting considerations). Factors such as security have had a lower priority.

Each cell in the AP1000 sees the same filename space. This means if each cell creates a file named "foo" then they will be the same file, and may overwrite each other.
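If each cell is meant to write its own file, one common way to avoid collisions in the shared namespace is to put the cell ID into the filename. The following is only a sketch: per_cell_name() is a hypothetical helper, not part of HiDIOS, and the caller is assumed to obtain the cell ID from the AP1000 library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of HiDIOS): build "<base>.<cell id>"
 * so that each cell opens a distinct file in the shared namespace.
 * The cell ID is supplied by the caller.
 * Returns 0 on success, -1 if the name does not fit in `out`. */
int per_cell_name(const char *base, int cell_id, char *out, size_t outlen)
{
    assert(base != NULL && out != NULL && outlen > 0);
    assert(strlen(base) > 0 && cell_id >= 0);

    /* Zero-pad to four digits so the names sort in cell order. */
    int n = snprintf(out, outlen, "%s.%04d", base, cell_id);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

With this, cell 7 would open "foo.0007" rather than the shared "foo".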

The filesystem ignores any part of a filename before (and including) a '@'. This allows people to use the same filenames on the Fujitsu and the HiDIOS filesystems.
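The '@' rule can be pictured with a small sketch. It only illustrates the rule as described here and is not HiDIOS code: hidios_effective_name() is a hypothetical name, and cutting at the first '@' is an assumption, since the behaviour with several '@' characters is not stated.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical illustration (not HiDIOS source): return the part of
 * `name` that follows the first '@', or `name` itself when it has no '@'.
 * Cutting at the FIRST '@' is an assumption. */
const char *hidios_effective_name(const char *name)
{
    assert(name != NULL);

    const char *at = strchr(name, '@');
    return (at != NULL) ? at + 1 : name;
}
```

Under this reading, "fujitsu@results.dat" and "results.dat" would refer to the same HiDIOS file.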

Limitations of the HiDIOS File System

At the moment any one task in one cell can't have more than about 1200 files open at once.

The application must be configured to include all the cells that have disks. cconfxy(0,0) will give you the whole machine. The minimal configuration can be obtained using cconfxy(_getcinfo(CAP_NOPT0),1) assuming you have included cap2.c.h.

By default you are placed in /home/xxx where xxx is your username. If you want to use a different home directory then set the environment variable CAPHOME before starting your programs to the directory you want.
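A program that wants to report or log its expected starting directory could mirror this rule as sketched below. The function names are hypothetical, and treating an empty CAPHOME as unset is an assumption; HiDIOS itself performs the real lookup.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write the expected starting directory into `out`.
 * `caphome` is the value of the CAPHOME variable, or NULL if unset.
 * Treating an empty value as unset is an assumption.
 * Returns 0 on success, -1 if the result does not fit. */
int resolve_start_dir(const char *caphome, const char *username,
                      char *out, size_t outlen)
{
    assert(username != NULL && out != NULL && outlen > 0);

    int n;
    if (caphome != NULL && strlen(caphome) > 0)
        n = snprintf(out, outlen, "%s", caphome);         /* user override */
    else
        n = snprintf(out, outlen, "/home/%s", username);  /* default */
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}

/* Convenience wrapper that reads CAPHOME from the environment. */
int start_dir_from_env(const char *username, char *out, size_t outlen)
{
    return resolve_start_dir(getenv("CAPHOME"), username, out, outlen);
}
```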

Creation of lots of small files will be inefficient as far as space is concerned due to the heuristic for determining initial file sizes, and the contiguous nature of file storage.

What doesn't work

There are probably functions that I've forgotten to add. Let me know when something is missing (or doesn't work).

Some of the ones I know of that are missing are chmod(), chown(), ioctl() and fcntl().

Porting to the HiDIOS File System

  1. Make sure your AP1000HOME environment variable points to /usr/ap1000/hidios and that $AP1000HOME/bin is in your path, both when compiling and when running. If you use the tcsh or csh shell, you may find the command "source ~tridge/etc/hidios" useful for setting this up.

  2. All stdio functions and the vast majority of normal file functions are available to programs.

  3. Preferably use gcc to compile, although this is not essential.

  4. You must use a cell configuration that includes all the cells that have disks. cconfxy(0,0) will give you the whole machine. The minimal configuration can be obtained using cconfxy(_getcinfo(CAP_NOPT0),1) assuming you have included cap2.c.h.

  5. You can use existing object code without recompiling, but you must relink to use the filesystem.

HiDIOS and MPI

If you use MPI then some of the above doesn't apply to you. However:

  1. You still need to set your AP1000HOME environment variable so you can use the filesystem utilities like "tlist".

  2. The same IO functions are available; MPI-IO is not supported yet.

  3. Use the normal AP1000 MPI method of compiling. See the MPI docs for details.

  4. Make sure you always run with all cells.

  5. Make sure you are using the latest release (1.6) of MPI (check your MPI_ROOT environment variable).

HiDIOS File System Utilities

There is a simple shell called apsh that can be used to simulate being logged onto the AP1000 in order to perform a variety of standard file operations. Alternatively, a number of very simple utilities have been written to help users. These include: tlist, trename, tmkdir, trmdir, tcopyto, tcopyfrom, tcopyall, tcopycap, tcomp and tdelete. These should be in your path if you have the right AP1000HOME environment variable setting.

A few other utilities that can be used to affect speed (buffering), fragmentation and filesystem integrity are tbuffer, trepack and tfsck.

HiDIOS File System Performance

You should expect to get 40Mb/sec or better for very large IO operations. If you are doing lots of small IO operations then consider using stdio or the tbuffer() function, as the local buffering will gain you a lot of speed.

The filesystem should provide good IO balancing automatically. If only one cell is reading/writing at a time then it should get up to 16Mb/sec.

If you use stdio then you should know that the default buffer size is 1024 bytes, which is very small. I recommend you use setbuffer() to raise this to at least 128K for reasonable throughput.
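As a rough sketch of raising the buffer size, the helper below opens a file and asks for a 128K buffer. It uses the ISO C setvbuf() call rather than setbuffer(); whether the HiDIOS stdio accepts setvbuf() in the same way is an assumption, and open_with_big_buffer() is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>

#define BIG_BUF_SIZE (128 * 1024)  /* 128K, the minimum suggested above */

/* Hypothetical helper: open `path` and request a fully buffered stream
 * with a 128K buffer. Returns NULL if the open or the request fails. */
FILE *open_with_big_buffer(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);

    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return NULL;

    /* Passing NULL lets the library allocate the buffer itself.
     * setvbuf() must be called before any other operation on the stream. */
    if (setvbuf(fp, NULL, _IOFBF, BIG_BUF_SIZE) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

The stream is then used with the normal stdio calls (fprintf(), fwrite(), fclose()), and small writes are collected in the larger buffer before being passed on to the filesystem.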