What is file expansion? - linux

found this and could understand
Example: Windows 8.3 filename expansion “c:\program files” be -
comes “C:\PROGRA~1”
i tryed to navigate to the two paths and they worked both
anyone could make it clear

This is a holdover from the days of Windows 95, which revamped the filesystem FAT to FAT32, which enabled long filenames, and was a part of the selling point of the system itself.
At the time, there was still, old DOS packages, old Win 3.1 packages, that relied on the old filename convention 8.3 that is, 8 characters with 3 character for extension.
Windows 95 incorporated the API, to convert automatically in both directions, whilst maintaining compatibility with the existing FAT system, even after using the convert FAT utility. This was to ensure that no breakage of the files occurred, in the context of the old applications on it.
That API is still available to this day.
GetShortPathName with the long filename as parameter, returns the short 8.3, with abbreviation in the form of ~.
GetLongPathName with the 8.3 filename as parameter, returns the long filename.
Source found in MSDN

In ye olde days, the FAT file system used by MS-DOG only supported eight character file names.
When MS switched to the FAT32 file system that used longer names (and later to the NTFS, this created migration issues. There were old systems that only supported 8+3 file names that would be accessing FAT32 disks over a network and there would be old software that only worked with 8+3 file names.
The solution MS came up with was to create short path names that used ~ and numbers to create unique 8+3 aliases for longer file names.
If you were on an old system and accessing a networked disk (or even using DOS commands on a FAT32 local disk early on):
c:\program files
became
C:\PROGRA~1
If you had
c:\program settings
That might come out as
C:\PROGRA~2
In short, this was then a system for creating unique 8+3 file names that mapped to longer file names so that they could be used with legacy systems and software.

Related

glibc function fails on large file

I have a utility I wrote years ago in C++ which takes all the files in all the subdirectories of a given directory and moves them to new numbered subdirectories based on a count of the files. It has worked without error for several years.
Yesterday it failed for the first time. It always fails on a 2.7Gig video file, perhaps the largest this utility has ever encountered. The file itself is not corrupt. It will play in a video player. I can move it with command line or file manager apps without a problem.
I use ntfw() to walk the directory subtree. On this file, ntfw() returns an error code of -1 on encountering the file, before calling my callback function. Since (I thought) the code is only dealing with filenames and not actually opening or reading the file, I don't understand why the file size should be an issue.
The number of open file descriptors is not the problem. Nor the number of files. It was in a subtree of over 5,000 files, but when moving it to one of only 50 it still fails, while the original subtree is processed without error. File permissions are not the problem. This file has the same as all the others. This includes ACL permissions.
The question is: Is file size the issue? Why?
The file system is ext4.
ldd --version /usr/lib/i386-linux-gnu/libc.so
ldd (Ubuntu GLIBC 2.27-3ubuntu1.4) 2.27
Linux version 4.15.0-161-generic (buildd#lgw01-amd64-050)
(gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04))
#169-Ubuntu SMP Fri Oct 15 13:39:59 UTC 2021
As you're using a 32-bit application, in order to work properly with files larger than 2 GB you should compile with -D_FILE_OFFSET_BITS=64 in order to use 64-bit file handling syscalls and types.
In particular, nftw() calls stat() which fails with EOVERFLOW if the size of the file exceeds 2 GB: https://man7.org/linux/man-pages/man2/stat.2.html
Also, for using mmap() (which it seems you're not using, but just in case as a comment was mentioning it), you can't allocate all of 4 GB, some of the address space is reserved for the kernel (typically 1 GB on Linux). Then some space is used by the stack(s), shared libraries etc. Maybe you'll be able to map 2 GB at a time, if you're lucky.

Okay to call a file core.foo?

On Unix-like operating systems, one should never call a file core because it might be overwritten by a core dump, so that name is de facto reserved for use by the operating system.
But what about with an extension, like core.c? It seems to me this should be perfectly okay; core and core.c are two distinct names.
Is the above correct, and names like core.foo are okay? Or am I missing anything?
On Unix-like operating systems,
Ie. on a system conforming to the single UNIX Specification (or just to POSIX). Or on a system belonging to the unix family.
one should never call a file core because it might be overwritten by a core dump, so that name is de facto reserved for use by the operating system.
No. I do not think there is a specification that specifies that core dumps are going into a file named "core". It's all "implementation defined", from posix:
3.117 Core File
A file of unspecified format that may be generated when a process terminates abnormally.
If coredump is created, where is it created, how and what the contents are, it's all up to implementation. [Freebsd creates a file named executable_name.core for ages. So not "on unix-like operating systems". Your sentence could be made valid by changing the beginning to:
On a linux system with kernel version lower then 2.6 or 2.4.21 one should never call a file "core", because...
Kernel 2.6 and 2.4.21 is more then 15 years old. Newer kernels have /proc/sys/kernel/core_pattern that allow specifying the filename and location and even a process to run on a coredump. So if you are working with such archaic kernel version lower then 2.6 or 2.4.21 (or the content of /proc/sys/kernel/core_pattern is naively set to core), then yes, you should be careful when creating a file named "core".
In today's world, that behavior doesn't matter at all, on most linux distributions systemd-coredump takes care of that and creates coredumps in /var/lib/systemd/coredump.
But what about with an extension, like core.c?
It's ok - core and core.c are different filenames. Dot is not a special character, it's just a character like anything else, core.c differs from core as much as core12 or coreAB does...
Is the above correct,
Yes.
and names like core.foo are okay?
Are okay.
Or am I missing anything?
No idea, but if you are, I surely hope you will find it.

In node.js how can I know whether fs.stat() will return usable crtime and/or birthtime fields for a given file/path/volume/fs?

I recently learned that different OSes and even different filesystems under the same OS support different subsets of the timestamps returned by lstat.
The Stats object returned gives us four times, each in two different flavours.
js Date objects:
atime: the last time this file was accessed expressed in milliseconds since the POSIX Epoch
mtime: the last time this file was modified ...
ctime: the last time the file status was changed ...
birthtime: the creation time of this file
(atimeMs, mtimeMs, ctimeMs, and birthtimeMs are js Date object versions of each of the above)
"Modified" means the file's contents were changed by being written to etc.
"Changed" means the file's metadata such as owners and permissions was changed.
Linux has traditionally never supported the concept of birth time, but as more newer filesystems did support it, it has recently had support added to hopefully all relevant layers of the Linux stack if I have read correctly.
But Windows and Mac both do support birth time as do their native filesystems.
Windows on the other hand did not traditionally support a concept of file change separate from file modification. But to comply to POSIX it added support at the API level and to NTFS. (It doesn't seem to be exposed anywhere in the GUI or commandline though). FAT fs does not support it.
When I call lstat on a file on Windows on an NTFS drive the results for ctime look good. When I call it on a file on a FAT drive, ctime contains junk. (In my case it's always 2076-11-29T08:54:34.955Z for every file.)
I don't know if this is a bug.
I don't know what birthtime returns on Linux on filesystems that don't support it. Hopefully null or undefined but perhaps also garbage. I also don't know what Linux or Mac return in ctime for files on FAT volumes.
So is there a way in Node to get info on which of these features are supported for a given file/path/fs?

Copy SQLite database to another path

I am creating a sqlite database in temp folder. Now I want to copy that file to another folder. Is there any sqlite command to rename the sqlite database file?
I tried using rename function in c++ but it returns error 18. Error no 18 means: "The directory containing the name newname must be on the same file system as the file (as indicated by the name oldname)".
Can someone suggest a better way to do this.
Use a temporary directory on the correct filesystem!
First, sqlite database is just a file. It can be moved or copied around whatever you wish, provided that:
It was correctly closed last time, so there would be no stuff to roll-back in the journal
If it uses write-ahead log type of journal, it is fully checkpointed.
So moving it as a file is correct. Now there are two ways to move a file:
Using the rename system call.
By copying the file and deleting the old one.
The former has many advantages: it can't leave partially written file around, it can't leave both files around , it is very fast and if you use it to rename over old file, there is no period where the target name wouldn't exist (the last is POSIX semantics and Windows can do it on NTFS, but not FAT filesystem). It also has one important disadvantage: it only works within the filesystem. So you have to:
If you are renaming it to ensure that a partially written file is not left behind in case the process fails, you need to use rename and thus have to use a temporary location on the same filesystem. Often this is done by using different name in the same directory instead of using temporary directory.
If you are renaming it because the destination might be slow, e.g. because you want to put it on a network share, you obviously want to use temporary directory on different filesystem. That means you have to read the file and write it under the new name. There is no function for this in the standard C or C++ library, but many libraries will have one (and high-level languages like python will have one and you can always just execute /bin/mv to do it for you).
Now I want to copy that file to another folder. Is there any sqlite command to rename the sqlite database file?
Close the database. Copy the database to the new path using the shell.
Also see Distinctive Features Of SQLite:
Stable Cross-Platform Database File
The SQLite file format is cross-platform. A database file written on
one machine can be copied to and used on a different machine with a
different architecture. Big-endian or little-endian, 32-bit or 64-bit
does not matter. All machines use the same file format. Furthermore,
the developers have pledged to keep the file format stable and
backwards compatible, so newer versions of SQLite can read and write
older database files.
Most other SQL database engines require you to dump and restore the
database when moving from one platform to another and often when
upgrading to a newer version of the software.

Are Linux's timezone files always in /usr/share/zoneinfo?

I'm writing a program that needs to be able to read in the time zone files on Linux. And that means that I need to be able to consistently find them across distros. As far as I know, they are always located in /usr/share/zoneinfo. The question is, are they in fact always located in /usr/share/zoneinfo? Or are there distros which put them elsewhere? And if so, where do they put them?
A quote from tzset(3):
The system timezone directory used depends on the (g)libc version.
Libc4 and libc5 use /usr/lib/zoneinfo,
and, since libc-5.4.6,
when this doesn't work, will try /usr/share/zoneinfo.
Glibc2 will use the environment
variable TZDIR, when that exists. Its
default depends on how it was installed, but normally is
/usr/share/zoneinfo.
Note, however, that nothing prevents some perverse distro from patching libc and placing the files wherever they want.
The public-domain time zone database contains the code and data to handle time zones on Linux.
The public-domain time zone database
contains code and data that represent
the history of local time for many
representative locations around the
globe. It is updated periodically to
reflect changes made by political
bodies to time zone boundaries, UTC
offsets, and daylight-saving rules.
This database (often called tz or
zoneinfo) is used by several
implementations, including the GNU C
Library used in GNU/Linux, FreeBSD,
NetBSD, OpenBSD, Cygwin, DJGPP, AIX,
Mac OS X, OpenVMS, Oracle Database,
Solaris, Tru64, and UnixWare.
That covers a lot of system but I can only agree with Roman that nobody can be prevented from creating a distribution that differs for whatever reasons. The existence and location of a zonezinfo file is not covered by any official standard as far as I know. The standards (e.g. POSIX and XPG4) only establish the API.

Resources