Rust reserved names for modules [duplicate] - rust

I'm not asking about general syntactic rules for file names. I mean gotchas that jump out of nowhere and bite you. For example, trying to name a file "COM<n>" on Windows?

From: http://www.grouplogic.com/knowledge/index.cfm/fuseaction/view_Info/docID/111.
The following characters are invalid as file or folder names on Windows using NTFS: / ? < > \ : * | " and any character you can type with the Ctrl key.
In addition to the above illegal characters the caret ^ is also not permitted under Windows Operating Systems using the FAT file system.
Under Windows using the FAT file system, file and folder names may be up to 255 characters long.
Under Windows using the NTFS file system, file and folder names may be up to 255 characters long.
Under Windows, the length of a full path under both systems is limited to 260 characters.
In addition to these characters, the following conventions are also illegal:
Placing a space at the end of the name
Placing a period at the end of the name
The following file names are also reserved under Windows:
aux,
com1,
com2,
...
com9,
lpt1,
lpt2,
...
lpt9,
con,
nul,
prn
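A minimal Python sketch of the checks listed above, applied to a single name (not a full path). The function name and the message strings are made up for illustration. The rules are the ones quoted above, plus the fact that a reserved device name stays reserved when an extension is added.

```python
# Characters the list above calls invalid on NTFS, plus the reserved device names.
INVALID_CHARS = set('/?<>\\:*|"')
RESERVED_NAMES = (
    {"aux", "con", "nul", "prn"}
    | {f"com{n}" for n in range(1, 10)}
    | {f"lpt{n}" for n in range(1, 10)}
)

def windows_name_problems(name):
    """Return a list of reasons why `name` is risky as a Windows file name."""
    problems = []
    # Control characters (code points below 32) are rejected as well.
    if any(ch in INVALID_CHARS or ord(ch) < 32 for ch in name):
        problems.append("invalid character")
    if name.endswith((" ", ".")):
        problems.append("trailing space or period")
    # Device names are matched case-insensitively and ignore any extension.
    stem = name.split(".", 1)[0].rstrip(" ").lower()
    if stem in RESERVED_NAMES:
        problems.append("reserved device name")
    return problems
```

An empty list means none of the listed rules was violated; it does not prove the name is valid on every Windows file system.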

Full description of legal and illegal filenames on Windows: http://msdn.microsoft.com/en-us/library/aa365247.aspx

A tricky Unix gotcha when you don't know:
Files which start with - or -- are legal but a pain in the butt to work with, as many command line tools think you are providing options to them.
Many of those tools have a special marker "--" to signal the end of the options:
gzip -9vf -- -mydashedfilename
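When a tool has no "--" marker, a common workaround is to prefix the name with "./" so it no longer starts with a dash. A small Python sketch of that idea follows; the helper name dash_safe is hypothetical.

```python
def dash_safe(path):
    """Prefix a relative path that starts with '-' so tools do not parse it as an option."""
    if path.startswith("-"):
        return "./" + path
    return path

# Example of building an argument list (not executed here):
# ["gzip", "-9vf", dash_safe("-mydashedfilename")]
```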

As others have said, device names like COM1 are not possible as filenames under Windows because they are reserved devices.
However, there is an escape method to create and access files with these reserved names, for example, this command will redirect the output of the ver command into a file called COM1:
ver > "\\?\C:\Users\username\COM1"
Now you will have a file called COM1 that 99% of programs won't be able to open, and that will probably make them freeze if you try to access it.
Here's the Microsoft article that explains how this "file namespace" works. Basically it tells Windows not to do any string processing on the text and to pass it straight through to the filesystem. This trick can also be used to work with paths longer than 260 characters.
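As a rough illustration, the prefix can be added programmatically. The Python sketch below only manipulates strings (using ntpath, so it runs on any platform). The function name is hypothetical, and it assumes the input is an absolute drive or UNC path, since the prefix turns off the usual path normalization.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"  # the literal text \\?\

def to_extended_path(path):
    """Return `path` in the \\\\?\\ form that bypasses Windows name processing."""
    if path.startswith(EXTENDED_PREFIX):
        return path
    if not ntpath.isabs(path):
        raise ValueError("extended-length paths must be absolute")
    # Normalize first: with the prefix, Windows no longer fixes up "/" or "..".
    path = ntpath.normpath(path)
    if path.startswith("\\\\"):
        # UNC paths use the \\?\UNC\server\share form.
        return EXTENDED_PREFIX + "UNC\\" + path[2:]
    return EXTENDED_PREFIX + path
```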

The boost::filesystem Portability Guide has a lot of good info.

Well, for MSDOS/Windows, NUL, PRN, LPT<n> and CON. They even cause problems if used with an extension: "NUL.TXT"

Unless you're touching special directories, the only illegal names on Linux are '.' and '..'. Any other name is possible, although accessing some of them from the shell requires using escape sequences.
EDIT: As Vinko Vrsalovic said, files starting with '-' and '--' are a pain from the shell, since those character sequences are interpreted by the application, not the shell.

Related

Is there an alternative for the slash in a path?

I have an application which correctly escapes slashes ("/") in file names to avoid path traversal attacks.
The secret file has this path:
/tmp/secret.txt
I want to access this file by uploading a file with a special crafted file name (something like \/tmp\/secret.txt)
Is there any alternative syntax without the slashes which I can use so that Linux will read this file?
(I'm aware of URL encoding but as the escaping is done in the backend this has no use for me.)
No. The / is not allowed in a filename, no matter if it's escaped as \/ or not.
It is one out of only two characters that are not allowed in filenames, the other being \0.
This means that you obviously could use _tmp_secret.txt or -tmp-secret.txt, or replace the / in the path with any other character that you wish, to create a filename with a path "encoded into it". But in doing so, you cannot encode pathnames that include the chosen delimiter character in one or several of their path components and expect to decode them into the original pathnames.
This is, by the way, how OpenBSD's ports system encodes filenames for patches to software. In (for example) /usr/ports/shells/fish/patches we find files with names like
patch-share_tools_create_manpage_completions_py
which comes from the pathname of a particular file in the fish shell source distribution (probably share/tools/create_manpage_completions.py). These pathnames are however never parsed, and the encoding is only there to create unique and somewhat intelligible filenames for the patches themselves. The real paths are included in the patch files.
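A short Python sketch of why such an encoding cannot be decoded reliably. The function name is made up, and this is a simplified stand-in rather than the exact scheme the ports system uses.

```python
def flatten_path(path, delimiter="_"):
    """Encode a relative path into a single file name by replacing each '/'."""
    return path.strip("/").replace("/", delimiter)

# Two different paths collapse to the same name, so the mapping is one-way.
first = flatten_path("a/b_c")
second = flatten_path("a_b/c")
collision = first == second
```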

How to escape colon (:) in $PATH on UNIX?

I need to parse the $PATH environment variable in my application.
So I was wondering what escape characters would be valid in $PATH.
I created a test directory called /bin:d and created a test script called funny inside it. It runs if I call it with an absolute path.
I just can't figure out how to escape : in $PATH. I tried escaping the colon with \ and wrapping it in single ' and double " quotes, but whenever I run which funny it can't find it.
I'm running CentOS 6.
This is impossible according to the POSIX standard. This is not a function of a specific shell; PATH handling is done within the execvp function in the C library. There is no provision for any kind of quoting.
This is the reason why including certain characters (anything not in the "portable filename character set"; colon is specifically called out as an example) is strongly recommended against.
From SUSv7:
Since <colon> is a separator in this context, directory names that might be used in PATH should not include a <colon> character.
See also source of GLIBC execvp. We can see it uses the strchrnul and memcpy functions for processing the PATH components, with absolutely no provision for skipping over or unescaping any kind of escape character.
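For an application that has to parse $PATH itself, the matching behaviour is a plain split on the colon with no unescaping. A minimal Python sketch, assuming POSIX semantics where an empty entry stands for the current directory (the function name is hypothetical):

```python
def split_search_path(path_value):
    """Split a PATH-style string the way POSIX describes: ':' always separates."""
    # A zero-length entry means the current working directory.
    return [entry if entry else "." for entry in path_value.split(":")]
```

With this rule, a directory such as /bin:d is always read as the two entries /bin and d, which is why it cannot be found through PATH.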
Looking at the function
extract_colon_unit
it seems to me that this is impossible. The : is unconditionally and
inescapably used as the path separator.
Well, this is valid at least for bash. Other shells may vary.
You could try mounting it
mount /bin:d /bind
PATH=/bind
According to http://tldp.org/LDP/abs/html/special-chars.html single quotes should preserve all special characters, so without trying it, I would think that '/bin:d' would work (with)in $PATH.

How to enable linux support double backslashes "\\" as the path delimiter

Assume we have a file /root/file.ini.
In Ubuntu's shell, we can show the content with this command,
less /root\\file.ini
However, in Debian's shell, the same command reports that the file does not exist.
Does anybody happen to know how to make Linux support "\\" as a path delimiter? I need to solve this because we have software that tries to access a file using "\\". It works fine in Ubuntu, but not in Debian.
Thanks
Linux cannot support \ as a path delimiter (though perhaps it might be able to with substantial changes to the kernel). This is because \ is a valid file name character. In fact the only characters not allowed as part of a file name are / and \0 (the null character).
If this seems to be working under ubuntu, then I would check for the existence of a file called root\file.ini in /
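The difference can be seen without touching the file system by comparing Python's POSIX and Windows path modules on the same string. This is only an illustration of the two sets of rules, not of what the software in the question does.

```python
import ntpath
import posixpath

path = "/root\\file.ini"  # the text /root\file.ini

# POSIX rules: "/" is the only separator, so the backslash is part of the name.
posix_parts = posixpath.split(path)

# Windows rules: both "/" and "\" separate components.
windows_parts = ntpath.split(path)
```

Under POSIX rules the result is a file named root\file.ini directly in /, which matches the check suggested above.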
I believe you will probably find it easier to make your program platform independent.
I found this forum post, which states that / is a platform-independent path delimiter in ANSI C and that file operations will automatically convert / to the actual path delimiter used on the host OS.
have you tried "\\\\" (4 backslashes) first and third one for escaping and second and the last one to rule them all?

Does GCC support command files

MSVC compilers support command files which are used to pass command line options. This is primarily due to the restriction on the size of the command line parameters that can be passed to the CreateProcess call.
This is less of an issue on Linux systems but when executing cygwin ports of Unix applications, such as gcc, the same limits apply.
Therefore, does anyone know if gcc/g++ also support some type of command file?
Sure!
@file
Read command-line options from file. The options read are inserted
in place of the original @file option. If file does not exist, or
cannot be read, then the option will be treated literally, and not
removed.
Options in file are separated by whitespace. A whitespace
character may be included in an option by surrounding the entire
option in either single or double quotes. Any character (including
a backslash) may be included by prefixing the character to be
included with a backslash. The file may itself contain additional
@file options; any such options will be processed recursively.
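GCC spells this option @file. A build script could generate such a file along these lines. This is a Python sketch that follows only the quoting rules quoted above (backslash-escaping whitespace, quotes and backslashes); the function names are hypothetical.

```python
def quote_option(option):
    """Backslash-escape whitespace, quotes and backslashes in one option."""
    special = "\\'\""
    return "".join(
        "\\" + ch if ch in special or ch.isspace() else ch
        for ch in option
    )

def response_file_text(options):
    """Return the text of a command file with one escaped option per line."""
    return "\n".join(quote_option(opt) for opt in options) + "\n"
```

The resulting text would be written to a file such as options.txt (an arbitrary name) and passed to the compiler as @options.txt.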
You can also jury-rig this type of thing with xargs, if your platform has it.

Are /../ and /./ the only file system symbolic links?

I want to check that a file system path is valid and safe to use relative to another path. So I want to know if there are any other special characters like /../ and /./ which might cause a path to actually point somewhere else.
If that is all I have to worry about then a quick replace of those chars followed by something like this to check for any other bad filesystem chars should work right?
[^a-z0-9\.\-_]
(On windows stuff like C:\ would also have to be allowed)
The use case is that I have a folder which site administrators can create directories in and I want to FORCE them to only create directories in that folder. In other words, no being sneaky with ...path/uploads/../../../var/otherfolder/ if you know what I mean ;)
Which language are you using?
In PHP, for example, you can get the realpath of any string and then compare it to a base directory. If you find your base directory is a prefix of the realpath, then you're good to go.
Although that's only for PHP, you should be able to find a similar approach in other languages.
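For instance, the same idea might look like this in Python. It is a sketch under the assumption that the base directory itself is trusted, and the function name is made up. It compares whole path components rather than raw string prefixes, so that /base does not accidentally match /base2.

```python
import os

def is_inside(base_dir, user_path):
    """Return True if user_path, resolved relative to base_dir, stays inside base_dir."""
    base_real = os.path.realpath(base_dir)
    # realpath resolves "." and ".." segments and follows symbolic links.
    target_real = os.path.realpath(os.path.join(base_real, user_path))
    try:
        return os.path.commonpath([base_real, target_real]) == base_real
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

Note that an absolute user_path replaces the base entirely in os.path.join, so it is rejected unless it happens to point back inside the base directory.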
There are several oddities on Windows/DOS. Opening any of these names reads from or writes to a device rather than a file. I haven't tried how .NET handles these, but I presume that you would get some kind of security exception.
CON Console. Reads from keyboard, writes to screen.
"COPY CON temp.txt", end input with ctrl-z.
PRN Printer. (Defaults to LPT1?)
LPTn Parallel ports.
AUX "Auxiliary device." Have never seen anyone use this myself.
COMn Serial ports.
NUL /dev/null
When resolving paths, . and .. (and in most cases, // on Unix and \\ on Windows) are the main things you really need to worry about. RFC 3986 gives the following algorithm for resolving relative paths in URIs. For the most part, it also applies to file system paths.
An algorithm, remove_dot_segments:
1. The input buffer is initialized with the now-appended path components and the output buffer is initialized to the empty string.
2. While the input buffer is not empty, loop as follows:
   A. If the input buffer begins with a prefix of "../" or "./", then remove that prefix from the input buffer; otherwise,
   B. If the input buffer begins with a prefix of "/./" or "/.", where "." is a complete path segment, then replace that prefix with "/" in the input buffer; otherwise,
   C. If the input buffer begins with a prefix of "/../" or "/..", where ".." is a complete path segment, then replace that prefix with "/" in the input buffer and remove the last segment and its preceding "/" (if any) from the output buffer; otherwise,
   D. If the input buffer consists only of "." or "..", then remove that from the input buffer; otherwise,
   E. Move the first path segment in the input buffer to the end of the output buffer, including the initial "/" character (if any) and any subsequent characters up to, but not including, the next "/" character or the end of the input buffer.
3. Finally, the output buffer is returned as the result of remove_dot_segments.
Example run:
STEP   OUTPUT BUFFER         INPUT BUFFER
 1 :                         /a/b/c/./../../g
 2E:   /a                    /b/c/./../../g
 2E:   /a/b                  /c/./../../g
 2E:   /a/b/c                /./../../g
 2B:   /a/b/c                /../../g
 2C:   /a/b                  /../g
 2C:   /a                    /g
 2E:   /a/g

STEP   OUTPUT BUFFER         INPUT BUFFER
 1 :                         mid/content=5/../6
 2E:   mid                   /content=5/../6
 2E:   mid/content=5         /../6
 2C:   mid                   /6
 2E:   mid/6
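The steps above can be sketched in Python as follows. This is a direct transcription for illustration, not a replacement for a library routine. The output buffer is kept as a list of segments so that "remove the last segment" is a simple pop.

```python
def remove_dot_segments(path):
    """Apply the RFC 3986 remove_dot_segments steps to a path string."""
    inp = path
    out = []  # each entry is one segment, including its leading "/" if any
    while inp:
        if inp.startswith("../"):
            inp = inp[3:]
        elif inp.startswith("./"):
            inp = inp[2:]
        elif inp.startswith("/./"):
            inp = inp[2:]            # "/./" becomes "/"
        elif inp == "/.":
            inp = "/"
        elif inp.startswith("/../"):
            inp = inp[3:]            # "/../" becomes "/"
            if out:
                out.pop()
        elif inp == "/..":
            inp = "/"
            if out:
                out.pop()
        elif inp in (".", ".."):
            inp = ""
        else:
            # Move the first segment (with its leading "/") to the output.
            start = 1 if inp.startswith("/") else 0
            end = inp.find("/", start)
            if end == -1:
                end = len(inp)
            out.append(inp[:end])
            inp = inp[end:]
    return "".join(out)
```

Extra ".." segments at the root are silently dropped, so "/../../a" comes out as "/a".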
Don't forget that it's possible to do things like specify more ".." segments than there are parent directories. So if you're trying to resolve a path, you could end up trying to resolve beyond /, or in the case of Windows, C:\.
The answer depends on the filesystem used. It's different on Windows, different on *nix.
For example, on Windows-based desktop platforms, invalid path characters might include quote ("), less than (<), greater than (>), pipe (|), backspace (\b), null (\0), and Unicode characters 16 through 18 and 20 through 25.
I don't know which platform or language you are using, but if you are using .NET you can get the list of characters that cannot appear in a file name by calling Path.GetInvalidFileNameChars, and the list of characters that cannot appear in a path by calling Path.GetInvalidPathChars.
Unix symbolic links can be tricky, and can even be created to cause pathing loops on some systems. You should stat() the filename (lstat() reports on the link itself rather than its target) to get the actual inode and device numbers to see if two pathnames are actually the same file.
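A minimal Python sketch of that comparison. It uses os.stat, which follows symbolic links, whereas os.lstat would describe the link itself. The function name is hypothetical, and the standard library's os.path.samefile performs an equivalent check.

```python
import os

def same_file(path_a, path_b):
    """Return True if both paths resolve to the same device and inode."""
    stat_a = os.stat(path_a)  # follows symlinks to the final target
    stat_b = os.stat(path_b)
    return (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino)
```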
Have you considered using something like chroot? You can create something called a "chroot jail" that will prevent people from getting outside it. This is enforced by the OS, so you don't have to write it yourself. Note that this only works on *nix, and on some variants of *nix, it does not have all the security features necessary to make it foolproof (i.e. there are known ways of escaping).
I've already directly answered the question, but as Tom said, what you're trying to do is inherently dangerous. What you should probably do instead is create one directory at a time. Pass it through a regexp validator and don't let them use dot segments at all. Just have a text field in a form for the directory name and a "Make Directory" button. Let them traverse the directory tree to create sub-directories. This way you can be absolutely confident that the files are going where they should.
This has the advantage of working on both Windows and *nix without the need for chroot.
Addenda:
This Regexp will only match illegitimate directory names, assuming that you're accepting directories one at a time:
/^(\.\.?|.*?[^a-zA-Z0-9\. _-]+.*?|^)$/
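The same pattern can be used from Python roughly like this. It is a sketch that assumes one directory name is submitted at a time, and the function name is made up for illustration.

```python
import re

# Same pattern as above: matches ".", "..", the empty string,
# or any name containing a character outside a-z, A-Z, 0-9, ".", " ", "_", "-".
ILLEGITIMATE_NAME = re.compile(r"^(\.\.?|.*?[^a-zA-Z0-9\. _-]+.*?|^)$")

def is_valid_directory_name(name):
    """Return True if `name` does not match the illegitimate-name pattern."""
    return ILLEGITIMATE_NAME.search(name) is None
```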
Valid directory names:
"This is a directory"
".hidden"
"example.com"
"10-28-2009"
Invalid directory names:
""
"."
".."
"../somewhere/else"
"/etc/passwd"
"would:be?rejected!by;OS"
