Weird permissions bug

TedWalther · August 20, 2011, 01:46:02 PM

I have a directory with the permissions r-x------, owned by myself. I also have some directories that are owned by root.

I have a script that I am running to find duplicate files on my hard drive. Over the years a lot of stuff has been duplicated and cruft has accumulated.

When my script hits the directories owned by root, it crashes. So I run the script with "sudo", or I log in as root using "su". This fixes that problem. But then the script gets to my special directory with the permissions r-x------, and (directory ".myspecialdir") doesn't yield (). It returns nil. When I run the script as myself, (directory) gives () as expected.

TedWalther · August 20, 2011, 01:52:49 PM

I looked in nl-filesys.c at the code of the directory function. What is happening is that the opendir call is failing. newLISP is doing the right thing; now I have to find out why my operating system isn't letting the root user do whatever he wants to do. Permissions of 500 are enough for me to view the directory directly; they should be sufficient for root too.

The opendir manpage isn't much help, it doesn't discuss permissions. I'll have to read up on POSIX file permissions again.

TedWalther · August 20, 2011, 01:59:44 PM

I don't know how they did it, but it looks like the problem has something to do with the nature of .gvfs, the Gnome Virtual File System directory where all user file systems are mounted.

TedWalther · August 20, 2011, 02:22:51 PM

If anyone else runs into this problem, it turns out it is a design decision at the Linux kernel level which BREAKS one of the oldest Unix assumptions (that root can access any file). By default, a FUSE mount such as .gvfs is accessible only to the user who mounted it, and root is not exempt. This page describes a workaround: http://eis.comp.lancs.ac.uk/~carl/blog/2010/01/root-access-to-filesystems-mounted-via-gvfs/

Lutz · August 20, 2011, 09:26:27 PM

After (directory ".myspecialdir") returns nil, perhaps (sys-error) would return something enlightening.
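
To illustrate the same idea outside newLISP, here is a minimal Python sketch that lists a directory and, when that fails, reports the OS error code instead of just an empty result. The helper name `list_or_explain` is hypothetical. On a FUSE mount such as .gvfs one would expect root to get a permission error, but that is an assumption about the setup and is not verified here.

```python
import errno
import os

def list_or_explain(path):
    """Return (entries, None) on success, or (None, description) on failure."""
    try:
        return sorted(os.listdir(path)), None
    except OSError as e:
        # e.errno carries the errno value reported by the failed system call
        name = errno.errorcode.get(e.errno, "UNKNOWN")
        return None, f"{name}: {e.strerror}"

if __name__ == "__main__":
    # ".myspecialdir" is the directory name used earlier in this thread
    entries, problem = list_or_explain(".myspecialdir")
    print(entries if problem is None else problem)
```

The symbolic name (for example EACCES versus ENOENT) is usually enough to tell a permissions problem from a missing path.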

TedWalther · August 21, 2011, 02:16:34 AM

Lutz, my script is doing an md5 hash on each file. It bombs out on files that are bigger than the amount of RAM on my system. (crypto:md5 (read-file "myfile"))

You have read-file. Is mmap universal enough that you could add "mmap-file" as a builtin function? Like read-file, it would return a string.

What is the internal format of strings in newLISP? Is it a memory address plus a length counter? Thus allowing the 0 byte? If a file was mmapped in, how hard would it be to encapsulate it as a newLISP string? Then what would happen when the string is garbage collected using ORO? Is this better done in the unix/posix module?

Lutz · August 21, 2011, 12:03:58 PM

Strings can contain 0's in newLISP. The string length is stored in the cell->aux field. Most built-in functions in newLISP can handle binary content in strings, if not, the manual mentions it.

On my 1GB RAM Mac Mini running OS X 10.6.8, I can read a 2GB file without a problem. When allocating more memory than is available in RAM, the OS starts swapping out to disk. This happens transparently, and newLISP "thinks" it is handling a 2GB in-memory string object. Perhaps this is a problem specific to your OS and platform. Of course, all operations on virtual memory are a lot slower.

Perhaps you can find functions in the libcrypto.so library which handle files bigger than RAM available and modify the crypto.lsp module.
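
For illustration only, the incremental approach looks roughly like this in Python (not newLISP, and not the crypto.lsp module itself). The function name `md5_of_file` and the 1 MB chunk size are arbitrary choices for this sketch; the point is that only one chunk is held in memory at a time, whatever the file size.

```python
import hashlib

def md5_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so memory use stays bounded."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" at EOF
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A newLISP version would need the corresponding init/update/final calls imported from libcrypto, which is the modification to crypto.lsp suggested above.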

TedWalther · August 21, 2011, 06:43:09 PM

Well... some of the files I'm dealing with are 60G in size. I only have 8G of RAM.

I think being able to mmap stuff would be a lot more convenient. It lets the OS decide when to read pages in. It uses the file itself as the swap backing store. This could save a lot of memory usage and prevent dipping into swap. I have 8G of RAM; I don't use swap. It would be too painfully slow. Using swap slows everything down; using mmap'ed memory doesn't.

So, in theory, if I mmap the file, I only need the size of the file plus the starting address of the mapping to create an appropriate newLISP string?

The issue would be that when the string is garbage collected, the mmap would have to be released so that any changes are written to disk. The OS does that on its own anyway, but we don't want file descriptors piling up.
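
A minimal Python sketch of that lifecycle, as an illustration rather than a proposal for the newLISP internals: the file is mapped read-only, hashed straight from the mapping, and both the mapping and the file descriptor are released when the blocks exit. The function name `md5_via_mmap` is hypothetical, and the sketch assumes a non-empty file on a 64-bit system so that the whole file fits in the address space.

```python
import hashlib
import mmap

def md5_via_mmap(path):
    """Hash a non-empty file through a read-only memory map."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; mapping an empty file raises ValueError
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # hashlib reads the mapping through the buffer protocol, so the
            # file contents are never copied into a separate string first
            return hashlib.md5(mm).hexdigest()
    # Leaving the with-blocks unmaps the file and closes the descriptor
```

The explicit release at the end of the `with` blocks plays the role that a garbage-collection hook would have to play for an mmap-backed string.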

Yes, there is an API for reading a file a bit at a time, and it could be hacked into the crypto module. But I find that Windows has its own equivalent of mmap. This would be nice functionality to have: being able to treat a file on disk as a string without having to actually read it in. The OS would handle all the swap-in/swap-out details. And of course Mac OS X supports mmap, as does every other POSIX OS.
