I have a Delphi (Win32) web application that can run either as a CGI app, ISAPI or Apache DLL. I want to be able to generate a unique filename prefix (unique for all current requests at a given moment), and figure that the best way to do this would be to use the process ID (to handle CGI mode) as well as the thread ID (to handle DLL mode).
How would I get a unique Process ID and Thread ID in Delphi?
Will these be unique in a Multi-Core/Multi-Processor situation (on a single webserver machine)?
Edit: please note that I was advised against this approach, and thus the accepted answer uses a different method to generate temporary filenames
You have many good ideas presented here.
Does it also create an empty file to "get a lock on" the name?
No; I believe we rely on Windows to ensure the same temp file name is never given twice on the same computer since boot time.
Is there any chance of a clash if there is a split-second delay between generating the name and creating the file (if I need to create the file myself)?
No; that'd be a pretty bad thing.
Here's a routine I've been using for getting a temp file.
function GetTemporaryFileName:string;
var
Path, FileName: array[0..MAX_PATH] of Char;
begin
Win32Check(GetTempPath(MAX_PATH, Path) <> 0);
Win32Check(GetTempFileName(Path, '~EX', 0, FileName) <> 0);
Result:=String(Filename);
end;
You could instead use FileGetTempName( ) from JclFileUtils.pas in JCL.
Windows provides functionality for creating guaranteed unique file names. No need for creating your own:
Here's a Delphi wrapper around that functionality:
function CreateTempFileName(aPrefix: string): string;
var
Buf: array[0..MAX_PATH] of Char;
Temp: array[0..MAX_PATH] of Char;
begin
GetTempPath(MAX_PATH, Buf);
if GetTempFilename(Buf, PChar(aPrefix), 0, Temp) = 0 then
begin
raise Exception.CreateFmt(sWin32Error, [GetLastError, SysErrorMessage(GetLastError)]);
end;
Result := string(Temp);
end;
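For comparison, here is a minimal sketch of the same idea outside Delphi, written in Python purely as an illustration. It assumes the standard library's tempfile.mkstemp, which creates the file at the same moment it picks the name, so there is no gap between naming and creation. The helper name create_temp_file is made up for this example.

```python
import os
import tempfile


def create_temp_file(prefix="~EX"):
    """Create an empty, uniquely named file and return its path.

    Hypothetical helper for illustration only.
    """
    # mkstemp opens the file exclusively while choosing the name, so
    # another request cannot claim the same name in between.
    fd, path = tempfile.mkstemp(prefix=prefix)
    os.close(fd)  # keep the file on disk, release the handle
    return path


if __name__ == "__main__":
    path = create_temp_file()
    print(path)
    os.remove(path)
```

The caller is responsible for deleting the file when it is no longer needed.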
Could you not use a GUID instead?
Edit: Should have said first time around, check out the following two functions
CreateGuid
GuidToString
Process IDs are not guaranteed to be unique on Windows. They are certainly unique for the life of the process, but once a process dies its ID can be immediately reused. I am not certain about thread IDs. If these are temporary files you could use something equivalent to tmpfile or tmpnam (C functions, but I assume Delphi has an equivalent).
As Jamie posted a GUID may be better.
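To make the GUID suggestion concrete, here is a small sketch in Python (not Delphi) that assumes a random version-4 UUID is acceptable as the prefix. The function name guid_prefix is hypothetical.

```python
import uuid


def guid_prefix():
    """Return a random 32-character hex string usable as a filename prefix."""
    # uuid4 is random, so it does not depend on process or thread IDs.
    return uuid.uuid4().hex


if __name__ == "__main__":
    print(guid_prefix() + "_upload.tmp")
```

The same shape should carry over to CreateGuid and GuidToString in Delphi, though the string format those return may need characters stripped before use in a filename.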
1) How to get a unique Process ID & ThreadID in Delphi:
Answer:
NOTE: Ensure to add 'windows' to your uses clause in the implementation section
NOTE: Cardinals are unsigned 32-bit integers ranging from 0 to 4294967295
implementation
uses Windows;
procedure MySolution();
var
myThreadID:Cardinal;
myProcessID:Cardinal;
begin
myThreadID := windows.GetCurrentThreadID;
myProcessID := windows.GetCurrentProcessId;
end;
2) Will these be unique in a Multi-Core/Multi-Processor situation (on a single webserver machine)?
Answer: Yes.
The process identifier is valid from
the time the process is created until
the process has been terminated and is
unique throughout the system. (Not
unique to processor)
Until the thread terminates, the
thread identifier uniquely identifies
the thread throughout the system.
(Again, system wide, not unique to
processor)
Better than either of those options, you should be using the system function _tempnam. It returns a random file name in the directory for a file that does not exist. If you want to, you can supply a prefix to _tempnam so that the file you create is recognizably yours. If you are providing a unique prefix, there shouldn't be any worry about someone opening your file. There is another solution, however.
_tempnam is only good if you want to put the file into an arbitrary directory. If you don't care that the directory is the system temporary directory, use tmpfile_s instead. It will also create the file for you, so no worry about race conditions... Errors will only occur if you try to open more temp files than the system can handle. The big downside to tmpfile_s is that the file will disappear once you fclose it.
EDIT: I've gotten a downvote because this is a C function. You have access to the C runtime by importing them into delphi. Have a look at some examples with msvcrt.dll here.
function _tempnam(const Dir: PAnsiChar; const Prefix: PAnsiChar): PAnsiChar; cdecl;
external 'msvcrt.dll' name '_tempnam';
The others all gave you good and reasonable ideas, but still: if you're using files for temporary storage and if those files will always be created first (it doesn't matter if there is a leftover file with the same name already on the disk, as you'll overwrite it anyway), then the processid_threadid approach is completely valid.
Use GetCurrentProcessID and GetCurrentThreadID Win32 calls to access those two IDs.
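As an illustration of the processid_threadid idea, here is a minimal Python sketch (the Delphi version would use the two Win32 calls named above). The function name request_prefix and the underscore-separated format are assumptions made for this example.

```python
import os
import threading


def request_prefix():
    """Build a prefix from the current process ID and thread ID."""
    pid = os.getpid()              # distinct per live process
    tid = threading.get_ident()    # distinct per live thread
    return f"{pid}_{tid}_"


if __name__ == "__main__":
    print(request_prefix() + "report.tmp")
```

Both IDs can be reused once the process or thread has exited, so this only separates requests that are running at the same time.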
While researching this question I came across the fact that in POSIX (and Linux) there simply is not a truncateat system call.
Certain system calls like for instance unlink have an equivalent alternative method with an added at suffix at the end of their names, i.e. unlinkat. The difference between those methods is that the variations with the at suffix accept an additional argument, a file descriptor pointing to a directory. Therefore, a relative path passed into unlinkat is not relative to the current working directory but instead relative to the provided file descriptor (an open directory). This is really useful under certain circumstances.
Looking at truncate, there only is ftruncate next to it. truncate works on paths - absolute or relative to the current working directory. ftruncate directly works on an open file handle - without any path being specified. There is no truncateat.
A lot of libraries (various "alternative" C libraries) do what I did and mimic truncateat by using an openat-ftruncate-close sequence. This works, in most cases, except ...
I ran into the following issue. It took me months to figure out what was happening. Tested on Linux, different 3.X and 4.X kernels. Imagine two processes (not threads):
Process "A"
Process "B"
Now imagine the following sequence of events (pseudo code):
A: fd = open(path = 'filename', mode = write)
A: ftruncate(fd, 100)
A: write(fd, 'abc')
B: truncate('filename', 200)
A: write(fd, 'def')
A: close(fd)
The above works just fine. Just after process "A" has the file opened, set its size to 100 and written some stuff into it, process "B" re-sets its size to 200. Then process "A" continues. At the end, the file has a size of 200 and contains "abcdef" at its beginning followed by zero-bytes.
Now, let's try and mimic something like truncateat:
A: fd_a = open(path = 'filename', mode = write)
A: ftruncate(fd_a, 100)
A: write(fd_a, 'abc')
B: fd_b = openat(dirfd = X, path = 'filename', mode = write | truncate)
B: ftruncate(fd_b, 200)
B: close(fd_b)
A: write(fd_a, 'def')
A: close(fd_a)
My file has a length of 200, ok. It starts with three zero-bytes, not ok, then the "def", then again zero-bytes. I have just lost the first write from process "A" while the "def" technically landed at the correct position (three bytes in, as if I had called seek(fd_a, 3) before writing it).
I can work with the first sequence of operations just fine. But in my use case, I cannot rely on paths relative to the current working directory as far as process "B" is concerned. I really want to work with paths relative to a file descriptor. How can I achieve that without running into the issue demonstrated in the second sequence of operations? Calling fsync from process "A" just after write(fd_a, 'abc') does not solve this.
The reason why your second case overwrites everything with zeroes is that mode=truncate (i.e. openat(.., O_TRUNC)) will first truncate the file to length 0.
If you instead ftruncate to 200 immediately without first truncating to 0, the existing data up until that point will remain untouched.
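Here is a minimal sketch of that fix in Python, assuming a Linux system where os.open accepts dir_fd (which maps to openat). The helper name truncate_at is hypothetical, and for brevity both "A" and "B" run in one process here instead of two.

```python
import os
import tempfile


def truncate_at(dir_fd, name, length):
    """Emulate truncateat: resize `name`, resolved relative to `dir_fd`."""
    # No O_TRUNC here, so the file is never cut to length 0 first.
    fd = os.open(name, os.O_WRONLY, dir_fd=dir_fd)
    try:
        os.ftruncate(fd, length)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        dir_fd = os.open(d, os.O_RDONLY)
        path = os.path.join(d, "filename")
        fd_a = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)  # "A"
        os.ftruncate(fd_a, 100)
        os.write(fd_a, b"abc")
        truncate_at(dir_fd, "filename", 200)                   # "B"
        os.write(fd_a, b"def")
        os.close(fd_a)
        os.close(dir_fd)
        with open(path, "rb") as f:
            data = f.read()
        print(len(data), data[:6])
```

With this sequence the first write from "A" should survive, since nothing ever shrinks the file below the three bytes already written.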
There is quite a common issue in the Unix world: when you start a process with parameters, one of them being sensitive, other users can read it just by executing ps -ef (for example, mysql -u root -p secret_pw).
The most frequent recommendation I found was simply not to do that: never run processes with sensitive parameters, and instead pass this information some other way.
However, I found that some processes have the ability to change the parameter line after they have processed the parameters, so that they look, for example, like this in the process list:
xfreerdp -decorations /w:1903 /h:1119 /kbd:0x00000409 /d:HCG /u:petr.bena /parent-window:54526138 /bpp:24 /audio-mode: /drive:media /media /network:lan /rfx /cert-ignore /clipboard /port:3389 /v:cz-bw47.hcg.homecredit.net /p:********
Note the /p:******** parameter, where the password was removed somehow.
How can I do that? Is it possible for a process in Linux to alter the argument list it received? I assume that simply overwriting the char **args I get in the main() function wouldn't do the trick. I suppose that maybe changing some files in the /proc pseudo-filesystem might work?
"hiding" like this does not work. At the end of the day there is a time window where your password is perfectly visible so this is a total non-starter, even if it is not completely useless.
The way to go is to pass the password in an environment variable.
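A minimal sketch of the environment-variable approach, in Python, for a parent that launches the child itself. The variable name APP_PASSWORD and the helper run_with_secret are assumptions made up for this example; a real program such as mysql defines its own variable names.

```python
import os
import subprocess
import sys


def run_with_secret(argv, secret, var_name="APP_PASSWORD"):
    """Run `argv` with the secret in its environment, not its arguments."""
    env = dict(os.environ)   # copy, so the parent's environment is untouched
    env[var_name] = secret   # hypothetical variable name
    return subprocess.run(argv, env=env, capture_output=True,
                          text=True, check=True)


if __name__ == "__main__":
    child = "import os; print(len(os.environ['APP_PASSWORD']))"
    result = run_with_secret([sys.executable, "-c", child], "secret_pw")
    print(result.stdout.strip())
```

The secret never appears in the argument list that ps shows. On Linux the environment of a process is still readable by the same user and by root through /proc, so this narrows the exposure without removing it.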
I can't find a way to check the free space available on a device using Haxe, OpenFL, Lime or another library.
I would like to avoid downloading data that will exceed the size recommended for an app on each device.
What do you do to check that?
Try creating a file of that size! Then either delete it or reopen and write (not append) over its contents.
I don't know whether all platforms Haxe supports will work fine with this trick, but this algorithm is reported to work in many places and languages (I personally tested it in Ruby and saw the same suggestion for C++/.NET). To check whether X bytes of disk space are available:
open a new file for writing
seek X-1 bytes from the beginning
write a byte of data (whatever you want, 0, 42...)
close the file (probably unrelated to the task at hand, but don't forget to do that anyway)
If there's insufficient disk space, you'll likely get an exception at some point in this algorithm. You'll have to find out what errors to expect and process them properly.
Using ihx I've found this is working and requires nothing but Haxe Standard Library:
haxe interactive shell v0.3.4
type "help" for help
>> import sys.io.*;
>> var f = File.write('loca', true)
sys.io.FileOutput : { __f => #abstract }
>> f.seek(39999, FileSeek.SeekBegin)
Void : null
>> f.writeByte(0)
Void : null
>> f.close()
Void : null
After these manipulations, I had a file named loca of exactly 40000 bytes in my working directory.
By the way, be careful when doing things like these in ihx since it re-runs the entire session with the last entered line appended each time.
Ongoing experimentation:
However, when there's insufficient disk space, it may not fail with errors. In this case you'll have to check the real size with sys.FileSystem.stat(path).size. And don't forget to delete the file if there's not enough space.
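The whole procedure, including the size check and the cleanup, can be sketched as follows. This is Python rather than Haxe, and the function name can_allocate is hypothetical; it assumes size is at least 1.

```python
import os


def can_allocate(path, size):
    """Try to create a file of `size` bytes at `path`, then delete it.

    Returns True if the file ended up with the requested size.
    """
    try:
        with open(path, "wb") as f:
            f.seek(size - 1)   # jump to the position of the last byte
            f.write(b"\0")     # writing one byte extends the file
        # A missing error is not proof, so compare the real size too.
        return os.stat(path).st_size == size
    except OSError:
        return False
    finally:
        if os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    print(can_allocate("loca", 40000))
```

One caveat: on filesystems that support sparse files, the seek-and-write trick may report the full size without actually reserving the blocks, so treat a True result as a hint, not a guarantee.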
I've got a question about program architecture.
Say you've got 100 different log files with different formats and you need to parse and put that info into an SQL database.
My view of it is like:
use general config file like:
program1->name1("apache",/var/log/apache.log) (modulename,path to logfile1)
program2->name2("exim",/var/log/exim.log) (modulename,path to logfile2)
....
sqldb->configuration
use something like a module (1 file per program) type1.module (regexp, logstructure(somevariables), sql(tables and functions))
fork or thread processes (don't know what is better on Linux now) for different programs.
So the question is: is my view of this correct? Should I use one module per program (web/MTA/iptables),
or is there some better way? I think some regexps would be the same, like date/time/ip/url. What should I do with those? Or what have I missed?
example: mta exim4 mainlog
2011-04-28 13:16:24 1QFOGm-0005nQ-Ig
<= exim#mydomain.org.ua** H=localhost
(exim.mydomain.org.ua)
[127.0.0.1]:51127 I=[127.0.0.1]:465
P=esmtpsa
X=TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32
CV=no A=plain_server:spam S=763
id=1303985784.4db93e788cb5c#mydomain.org.ua T="test" from
<exim#exim.mydomain.org.ua> for
test#domain.ua
Everything that is bold is already parsed and will be put into the sqldb.incoming table. Right now I have a structure in Perl that holds every parsed variable, like $exim->{timstamp} or $exim->{host}->{ip}.
My program will do something like tail -f /file and parse it line by line.
Flexibility: let's say I want to add support for the Apache server (just timestamp, user IP and file downloaded). All I need to know is which logfile to parse, what the regexp should be and what the SQL structure should be. So I'm planning to have this as a module, and just fork or thread the main process with parameters (logfile, filetype). Maybe later I would add some options for what not to parse (maybe some log level is low and you just don't see much there).
I would do it like this:
Create a config file that is formatted like this: appname:logpath:logformatname
Create a collection of Perl classes that inherit from a base parser class.
Write a script which loads the config file and then loops over its contents, passing each iteration to its appropriate handler object.
If you want an example of steps 1 and 2, we have one on our project. See MT::FileMgr and MT::FileMgr::* here.
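The three steps above can be sketched roughly like this. The sketch is in Python, not Perl, and everything named here (BaseParser, EximParser, the exim_mainlog format name, the simplified regexes) is a hypothetical illustration. Shared patterns such as the timestamp and IP live in the base class so each format can reuse them.

```python
import re


class BaseParser:
    """Patterns shared by all log formats live here."""
    TIMESTAMP = r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
    IP = r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})"
    pattern = None  # each subclass supplies its own compiled regex

    def parse(self, line):
        m = self.pattern.search(line)
        return m.groupdict() if m else None


class EximParser(BaseParser):
    # Simplified, assumed pattern: timestamp, message id, first bracketed IP.
    pattern = re.compile(
        BaseParser.TIMESTAMP + r" (?P<msgid>\S+) .*?\[" + BaseParser.IP + r"\]"
    )


# Maps the logformatname field of the config to a parser class.
PARSERS = {"exim_mainlog": EximParser}


def load_config(text):
    """Parse lines of the form appname:logpath:logformatname."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        app, path, fmt = line.split(":", 2)
        entries.append((app, path, PARSERS[fmt]()))
    return entries


if __name__ == "__main__":
    for app, path, parser in load_config("exim:/var/log/exim.log:exim_mainlog"):
        print(app, path, type(parser).__name__)
```

Adding Apache support would then mean one new subclass and one new entry in PARSERS, with the database write kept in a separate layer.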
The log-monitoring tool wots could do a lot of the heavy lifting for you here. It runs as a daemon, watching as many log files as you could want, running any combination of perl regexes over them and executing something when matches are found.
I would be inclined to modify wots itself (which its licence freely allows) to support a database write method - have a look at its existing handle_* methods.
Most of the hard work has already been done for you, and you can tackle the interesting bits.
I think File::Tail is a nice fit.
You can make an array of File::Tail objects and poll them with select like this:
while (1) {
    ($nfound, $timeleft, @pending) =
        File::Tail::select(undef, undef, undef, $timeout, @files);
    unless ($nfound) {
        # timeout - do something else here, if you need to
    } else {
        foreach (@pending) {
            # here you can handle log messages depending on filename
            print $_->{"input"}." (".localtime(time).") ".$_->read;
        }
    }
}
(from perl File::Tail doc)
I'm having the following problem. I want to write a program in Fortran90 which I want to be able to call like this:
./program.x < main.in > main.out
Additionally to "main.out" (whose name I can set when calling the program), secondary outputs have to be written and I wanted them to have a similar name to either "main.in" or "main.out" (they are not actually called "main"); however, when I use:
INQUIRE(UNIT=5,NAME=sInputName)
The content of sInputName becomes "Stdin" instead of the name of the file. Is there some way to obtain the name of files that are linked to stdin/stdout when the program is called??
Unfortunately, the point of I/O redirection is that your program doesn't have to know what the input/output files are. On Unix-based systems you cannot look at the command-line arguments, as the < main.in > main.out are actually processed by the shell, which uses these files to set up standard input and output before your program is invoked.
You have to remember that sometimes the standard input and output will not even be files, as they could be a terminal or a pipe. e.g.
./generate_input | ./program.x | less
So one solution is to redesign your program so that the output file is an explicit argument.
./program.x --out=main.out
That way your program knows the filename. The cost is that your program is now responsible for opening (and maybe creating) the file.
That said, on Linux systems you can actually find out where your standard file handles are pointing from the special /proc filesystem. There will be symbolic links in place for each file descriptor:
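As a rough sketch of that design, here it is in Python rather than Fortran. The naming scheme for the secondary outputs (inserting a suffix before the extension) and the helper secondary_name are assumptions made up for this example.

```python
import argparse
import os


def parse_args(argv=None):
    """Read the output filename from an explicit --out argument."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--out", required=True, help="main output file")
    return parser.parse_args(argv)


def secondary_name(main_out, suffix):
    """Derive a related name, e.g. main.out -> main_debug.out."""
    root, ext = os.path.splitext(main_out)
    return f"{root}_{suffix}{ext}"


if __name__ == "__main__":
    args = parse_args()
    print(args.out, secondary_name(args.out, "debug"))
```

In Fortran 2003 the equivalent would read the argument with get_command_argument and open the files itself.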
/proc/<process_id>/fd/0 -> standard_input
/proc/<process_id>/fd/1 -> standard_output
/proc/<process_id>/fd/2 -> standard_error
Sorry, I don't know Fortran, but a pseudocode way of checking the output file could be:
out_name = realLink( "/proc/"+getpid()+"/fd/1" )
if( isNormalFile( out_name ) )
...
Keep in mind what I said earlier: there is no guarantee this will actually be a normal file. It could be a terminal device, a pipe, a network socket, whatever... Also, I do not know what other operating systems this works on other than Red Hat/CentOS Linux, so it may not be that portable. It is more of a diagnostic tool.
Maybe the intrinsic subroutines get_command and/or get_command_argument can be of help. They were introduced in Fortran 2003, and either return the full command line which was used to invoke the program, or the specified argument.
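For what it's worth, the pseudocode above can be turned into a runnable sketch in Python, assuming a Linux system with /proc mounted. The function name fd_target is hypothetical, and /proc/self is used as a shortcut for /proc/<process_id>.

```python
import os
import stat


def fd_target(fd):
    """Return the path behind an open descriptor, or None if it is not
    a regular file (terminal, pipe, socket, ...)."""
    if not stat.S_ISREG(os.fstat(fd).st_mode):
        return None
    # /proc/self/fd/<n> is a symbolic link to whatever the descriptor has open.
    return os.readlink(f"/proc/self/fd/{fd}")


if __name__ == "__main__":
    # 0 is standard input, 1 is standard output.
    print("stdin :", fd_target(0))
    print("stdout:", fd_target(1), file=os.sys.stderr)
```

Run as ./script < main.in > main.out, this should report the two redirected files; run from a terminal or in a pipeline, it returns None for them.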