Display file metadata like owner and group using Rust - linux

I'm working on a recursive file search and got it to work with simple permissions, but I can't determine how to get the owner (owner ID) or the group (group ID) of a folder or file. I've discovered how to get the current permissions of a file or folder: I get a u32, of which about 9 bits are used to store the permissions. But where and how are the timestamps saved? And the owner? In my research I've read that the Linux kernel allows more than 4 billion users on a system. Obviously this isn't in the u32 that I'm getting.
I'm working in rust and would not fear to write a C module.
But now here is my main.rs:
use std::fs::*;
use std::os::unix::fs::MetadataExt;
use std::os::unix::fs::PermissionsExt;
use std::mem::transmute;
fn main(){
let meta = metadata("./test.txt");
if meta.is_ok(){
let m:u32 = meta.unwrap().permissions().mode();
//let bytes: [u8; 4] = unsafe { transmute(m.to_be()) }; // possibly used later
print!("{}",if (m & 0o170000) == 0o040000 {"d"}else{"-"}); // directory flag lives in the file-type bits; bit 9 is the sticky bit
print!("{}",if (m & (0x1<<8)) >= 1 {"r"}else{"-"});
print!("{}",if (m & (0x1<<7)) >= 1 {"w"}else{"-"});
print!("{}",if (m & (0x1<<6)) >= 1 {"x"}else{"-"});
print!("{}",if (m & (0x1<<5)) >= 1 {"r"}else{"-"});
print!("{}",if (m & (0x1<<4)) >= 1 {"w"}else{"-"});
print!("{}",if (m & (0x1<<3)) >= 1 {"x"}else{"-"});
print!("{}",if (m & (0x1<<2)) >= 1 {"r"}else{"-"});
print!("{}",if (m & (0x1<<1)) >= 1 {"w"}else{"-"});
println!("{}",if (m & 0x1) >= 1 {"x"}else{"-"});
println!("{:b}",m);
}
}
Do not hesitate to modify my code if you think it can be improved.
I'm doing this for fun and to learn more about the code underneath the horizon.

std::os::linux::fs::MetadataExt (or its os::unix counterpart) provides the relevant platform-specific functions. It looks like you need meta.st_uid(), meta.st_gid(), etc. (std::os::unix::fs::MetadataExt exposes the same values as uid() and gid()). By the way, it is much better to write your code like this:
if let Ok(meta) = metadata("./test.txt") {
println!("{}", meta.st_gid());
// ...
}
"I'm working in rust and would not fear to write a C module"
Rust has excellent FFI for such cases. For example, you can add libc crate with libc bindings and just call libc::stat function with familiar API.

Owner is meta.unwrap().uid() and group is meta.unwrap().gid(). They are u32 each, which is what Linux uses.
To get the actual names, use libc::getpwuid_r and libc::getgrgid_r. See also getpwuid(3) and getgrgid(3).
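The pieces above can be put together using only the standard library. The following is a minimal sketch, not a definitive implementation: `mode_string` and `describe` are hypothetical helper names, and it assumes a Unix target where `std::os::unix::fs::MetadataExt` is available. It also shows where the timestamp comes from: like the owner and group, it is a separate field of the file's metadata, not part of the mode bits.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;

/// Render an `st_mode` value roughly the way `ls -l` does, e.g. "-rw-r--r--".
/// (Hypothetical helper; only directories get a special type character here.)
fn mode_string(mode: u32) -> String {
    // The file type is stored in the bits above the permission bits (mask 0o170000).
    let kind = if mode & 0o170000 == 0o040000 { 'd' } else { '-' };
    let mut out = String::with_capacity(10);
    out.push(kind);
    // Walk the nine permission bits from owner-read (bit 8) down to other-execute (bit 0).
    for (i, c) in "rwxrwxrwx".chars().enumerate() {
        let bit = 1u32 << (8 - i);
        out.push(if mode & bit != 0 { c } else { '-' });
    }
    out
}

/// Collect mode, owner id, group id and modification time (seconds since the epoch).
fn describe(path: &str) -> io::Result<String> {
    let meta = fs::metadata(path)?;
    Ok(format!(
        "{} uid={} gid={} mtime={}",
        mode_string(meta.mode()),
        meta.uid(),
        meta.gid(),
        meta.mtime()
    ))
}

fn main() {
    match describe("./test.txt") {
        Ok(line) => println!("{}", line),
        Err(e) => eprintln!("cannot stat ./test.txt: {}", e),
    }
}
```

Resolving the numeric ids to names still needs something like the libc calls mentioned above; the standard library does not provide that lookup.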

Related

What does applying XOR between the input instructions & account data accomplish in this Solana smart contract?

https://github.com/solana-labs/break/blob/master/program/src/lib.rs
use solana_program::{
account_info::AccountInfo, entrypoint, entrypoint::ProgramResult, pubkey::Pubkey,
};
entrypoint!(process_instruction);
fn process_instruction<'a>(
_program_id: &Pubkey,
accounts: &'a [AccountInfo<'a>],
instruction_data: &[u8],
) -> ProgramResult {
// Assume a writable account is at index 0
let mut account_data = accounts[0].try_borrow_mut_data()?;
// xor with the account data using byte and bit from ix data
let index = u16::from_be_bytes([instruction_data[0], instruction_data[1]]);
let byte = index >> 3;
let bit = (index & 0x7) as u8;
account_data[byte as usize] ^= 1 << (7 - bit);
Ok(())
}
This is from one of their example applications. I'm really not sure what to make of it, or where one might even begin to look to understand what the intent is and how it works.
Thanks in advance.
EDIT:
Is this done to create a program-derived address? I found this on their API, and the above seems to make sense as an implementation of this I would imagine.
1 << n produces what's called a mask with only bit n set, such as 1 << 1 = 0010.
XOR is a useful operation that compares bits, and this code makes use of that property: wherever the mask has a 1, a current bit of 0 becomes 1 and a current bit of 1 becomes 0; wherever the mask has a 0, the bit is left unchanged.
Using the mask from above, we can select one specific bit and switch it based on its current value.
1111 ^ 0010 = 1101: the selected bit matched the mask, so it is set to 0.
1101 ^ 0010 = 1111: the selected bit differed from the mask, so it is set to 1.
In short, it toggles a bit; this is a common idiom in bit-manipulation code.
bits ^= 1 << n
Related: https://stackoverflow.com/a/47990/15971564
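To see the toggle in isolation, here is a small sketch that applies the same index arithmetic as the program in the question to a plain byte slice. The function name `toggle_bit` is made up for illustration, and no Solana types are involved.

```rust
/// Flip one bit in `data`. `index` counts bits starting from the most
/// significant bit of byte 0, matching the arithmetic in the question.
fn toggle_bit(data: &mut [u8], index: u16) {
    let byte = (index >> 3) as usize; // index / 8 selects the byte
    let bit = (index & 0x7) as u8; // index % 8 selects the bit within that byte
    data[byte] ^= 1u8 << (7 - bit); // XOR with a one-bit mask flips that bit
}

fn main() {
    let mut data = [0u8; 2];
    toggle_bit(&mut data, 9);
    println!("{:08b} {:08b}", data[0], data[1]); // prints "00000000 01000000"
    toggle_bit(&mut data, 9);
    println!("{:08b} {:08b}", data[0], data[1]); // prints "00000000 00000000"
}
```

Calling the function twice with the same index restores the original data, which is the defining property of a toggle.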

What is the right way to access the __builtin_bswap functions?

I have an application that uses a database with data stored in big-endian order. To access this data portably across hardware platforms, I use 4 macros defined in a config.h module:
word(p) - gets a big-endian 16 bit value at pointer p as a native 16-bit value.
putword(p, w) - stores a native 16-bit variable (w) to pointer p as 16-bit big-endian.
dword(p) and putdword(p, d) do the same for 32-bit values
This all works fine, but the macros on a little-endian machine use the brute-force 'shift and mask' approach.
Anyway, it looks like there are __builtin_bswap16 and __builtin_bswap32 functions on Linux that may do this more efficiently (as inline assembler code?). So what's the right way to code my word/putword macros so that they use these builtin functions on an x86_64 Linux machine? Would coding my macros as htons/htonl function calls do the same thing as efficiently, and is it necessary to enable compiler optimization to get any of these solutions to work? I'd rather not optimize if it renders gdb useless.
Hmmm. I wrote a trivial test program using no special include files and simply calling the __builtin_bswap... functions directly (see the 'fast...' macros below). It all just works. When I disassemble the code in gdb, I see that the fast... macros do in 4-5 assembler instructions what takes up to 27 instructions for the worst case 'dword' macro. Pretty neat improvement for almost no effort.
#include <stdio.h>
typedef unsigned char uchar;
typedef unsigned short ushort;
typedef unsigned int uint;
#define word(a) (ushort) ( (*((uchar *)(a)) << 8) | \
(*((uchar *)(a) + 1)) )
#define putword(a,w) *((char *)(a)) = (char) (((ushort)((w) >> 8)) & 0x00ff), \
*((char *)(a)+1) = (char) (((ushort)((w) >> 0)) & 0x00ff)
#define dword(a) (uint) ( ((uint)(word(a)) << 16) | \
((uint)(word(((uchar *)(a) + 2)))) )
#define putdword(a,d) *((char *)(a)) = (char) (((uint)((d) >> 24)) & 0x00ff), \
*((char *)(a)+1) = (char) (((uint)((d) >> 16)) & 0x00ff), \
*((char *)(a)+2) = (char) (((uint)((d) >> 8)) & 0x00ff), \
*((char *)(a)+3) = (char) (((uint)((d) >> 0)) & 0x00ff)
#define fastword(a) ((ushort) __builtin_bswap16(*((ushort *)(a))))
#define fastputword(a, w) (*((ushort *)(a)) = __builtin_bswap16((ushort)(w)))
#define fastdword(a) ((uint) __builtin_bswap32(*((uint *)(a))))
#define fastputdword(a, d) (*((uint *)(a)) = __builtin_bswap32((uint)(d)))
int main()
{
unsigned short s1, s2, s3;
unsigned int i1, i2, i3;
s1 = 0x1234;
putword(&s2, s1);
s3 = word(&s2);
i1 = 0x12345678;
putdword(&i2, i1);
i3 = dword(&i2);
printf("s1=%x, s2=%x, s3=%x, i1=%x, i2=%x, i3=%x\n", s1, s2, s3, i1, i2, i3);
s1 = 0x1234;
fastputword(&s2, s1);
s3 = fastword(&s2);
i1 = 0x12345678;
fastputdword(&i2, i1);
i3 = fastdword(&i2);
printf("s1=%x, s2=%x, s3=%x, i1=%x, i2=%x, i3=%x\n", s1, s2, s3, i1, i2, i3);
}
I would just use htons, htonl and friends. They're a lot more portable, and it's very likely that the authors of any given libc will have implemented them as inline functions or macros that invoke __builtin intrinsics or inline asm or whatever, resulting in what should be a nearly-optimal implementation for that specific machine. See what is generated in godbolt's setup, which I think is some flavor of Linux/glibc.
You do need to compile with optimizations for them to be inlined, otherwise it generates an ordinary function call. But even -Og gets them inlined and should not mess up your debugging as much. Anyway, if you're compiling without optimizations altogether, your entire program will be so inefficient that the extra couple instructions to call htons must surely be the least of your worries.
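For comparison, the same four accessors can be sketched in Rust, where the standard integer types carry their own byte-order conversions. This is not the asker's C; the function names simply mirror the macros in the question, and the sketch assumes the buffer is at least as long as the value being read or written. Whether a given compiler turns these into a single byte-swap instruction is worth confirming in the disassembly, as the asker did.

```rust
/// Read a big-endian 16-bit value from the first two bytes of `p`.
fn word(p: &[u8]) -> u16 {
    u16::from_be_bytes([p[0], p[1]])
}

/// Store `w` as big-endian in the first two bytes of `p`.
fn put_word(p: &mut [u8], w: u16) {
    p[..2].copy_from_slice(&w.to_be_bytes());
}

/// Read a big-endian 32-bit value from the first four bytes of `p`.
fn dword(p: &[u8]) -> u32 {
    u32::from_be_bytes([p[0], p[1], p[2], p[3]])
}

/// Store `d` as big-endian in the first four bytes of `p`.
fn put_dword(p: &mut [u8], d: u32) {
    p[..4].copy_from_slice(&d.to_be_bytes());
}

fn main() {
    let mut buf = [0u8; 4];
    put_word(&mut buf, 0x1234);
    println!("word  = {:#x}", word(&buf));
    put_dword(&mut buf, 0x1234_5678);
    println!("dword = {:#x}", dword(&buf));
    // swap_bytes is the explicit byte swap, comparable in spirit to a bswap builtin.
    println!("swap  = {:#x}", 0x1234u16.swap_bytes());
}
```

Because these functions work on byte slices, they do not depend on the alignment of the underlying buffer.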

Handling f64 or Complex64 return types. Generics? Either?

I have a functioning Rust program using real doubles (f64) as the underlying type and wish to extend the system such that it can also handle complex values (num::complex::Complex64).
A (cut down example) function takes some configuration struct config, and depending on that input generates a potential value at an index idx:
fn potential(config: &Config, idx: &Index3) -> Result<f64, Error> {
let num = &config.grid.size;
match config.potential {
PotentialType::NoPotential => Ok(0.0),
PotentialType::Cube => {
if (idx.x > num.x / 4 && idx.x <= 3 * num.x / 4) &&
(idx.y > num.y / 4 && idx.y <= 3 * num.y / 4) &&
(idx.z > num.z / 4 && idx.z <= 3 * num.z / 4) {
Ok(-10.0)
} else {
Ok(0.0)
}
}
PotentialType::Coulomb => {
let r = config.grid.dn * (calculate_r2(idx, &config.grid)).sqrt();
if r < config.grid.dn {
Ok(-1. / config.grid.dn)
} else {
Ok(-1. / r)
}
}
}
}
I now wish to add a ComplexCoulomb match which returns a Complex64 value:
PotentialType::ComplexCoulomb => {
let r = config.grid.dn * (calculate_r2(idx, &config.grid)).sqrt();
if r < config.grid.dn {
Ok(Complex64::new(-1. / config.grid.dn, 1.))
} else {
Ok(Complex64::new(-1. / r, 1.))
}
}
This function is an early entry point in my program, which fills an ndarray::Array3; currently I'm operating on a number of variables with the type ndarray::Array3<f64> - so I need to generalise the whole program, not just this function.
How can I extend this program to use both types based on the input from config? This struct comes from parsing a configuration file on disk and will match a number of PotentialType::Complex* values.
I'm aware of two possible options, but am unsure if either fits my criteria.
Use something similar to Either and return Left for real and Right for complex; then use additional logic to treat the values separately in other functions.
Use generic types. This isn't something I've done too much of before and generalisation over many types seems like a fair chunk of complicated alteration of my current code base. Is there a way to reduce the complexity here?
If you have any other suggestions I'd love to hear them!
There might be a lot of code change, but using generic parameters is probably the most flexible approach, and it won't impact performance. Passing around an enum will be less performant, partly because the enum will be bigger (the size of the larger variant plus a tag to discriminate between them) and partly because the enum variant will have to be frequently checked.
One thing that can get cumbersome is the potentially long list of traits that constrain your type parameter. This can be done on the impl level, rather than on each function, to save repetition. There isn't currently a way to alias a set of traits, which would make this more ergonomic, but there is an RFC approved for that.
I made a very similar change in the Euclid library. It was more than a year ago, so much has changed since then, both in Rust and in that library, but a quick look over that commit should still give you an idea of the amount of changes necessary.
This is the current state of the same (renamed) implementation:
impl <T, Src, Dst> TypedTransform3D<T, Src, Dst>
where T: Copy + Clone +
Add<T, Output=T> +
Sub<T, Output=T> +
Mul<T, Output=T> +
Div<T, Output=T> +
Neg<Output=T> +
ApproxEq<T> +
PartialOrd +
Trig +
One + Zero {
// methods of TypedTransform3D defined here...
}
Some of those traits (Trig, One, Zero) are actually defined inside the crate, as they aren't in the standard library.
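One way to keep the bound list manageable is to bundle the required operations into a single trait of your own and use that as the only bound. The sketch below is an illustration under stated assumptions, not the asker's code: `Scalar`, `from_parts`, `coulomb` and `sum` are hypothetical names, and the local `Complex` struct stands in for num::complex::Complex64 so that the example has no dependencies.

```rust
use std::ops::Add;

/// Local stand-in for num::complex::Complex64 (assumption for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Complex {
    re: f64,
    im: f64,
}

impl Add for Complex {
    type Output = Complex;
    fn add(self, other: Complex) -> Complex {
        Complex { re: self.re + other.re, im: self.im + other.im }
    }
}

/// Hypothetical trait bundling everything the generic code needs,
/// so functions can write `T: Scalar` instead of a long bound list.
trait Scalar: Copy + Add<Output = Self> {
    fn from_parts(re: f64, im: f64) -> Self;
}

impl Scalar for f64 {
    // A real scalar simply drops the imaginary part.
    fn from_parts(re: f64, _im: f64) -> f64 {
        re
    }
}

impl Scalar for Complex {
    fn from_parts(re: f64, im: f64) -> Complex {
        Complex { re, im }
    }
}

/// Coulomb-like potential clamped at `dn`, generic over the scalar type.
fn coulomb<T: Scalar>(r: f64, dn: f64, im: f64) -> T {
    let r = if r < dn { dn } else { r };
    T::from_parts(-1.0 / r, im)
}

/// Example of downstream code that works for both scalar types.
fn sum<T: Scalar>(values: &[T]) -> T {
    values.iter().fold(T::from_parts(0.0, 0.0), |acc, &v| acc + v)
}

fn main() {
    let real: f64 = coulomb(2.0, 0.5, 1.0);
    let cplx: Complex = coulomb(2.0, 0.5, 1.0);
    println!("{} {:?}", real, cplx);
    println!("{:?}", sum(&[cplx, cplx]));
}
```

With this shape, the choice between the real and the complex path would be made once, near the point where the configuration is read, by calling the generic code with `f64` or with the complex type.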

Linux Framebuffer modes

I am currently working on embedded linux framebuffers. I know how to display the available resolutions of my system by typing:
cat /sys/class/graphics/fb0/modes
This gives me a list of resolutions, for instance:
U:720x576p-50
D:1920x1080p-50
D:1280x720p-50
D:1920x1080i-60
D:1920x1080i-50
U:1440x900p-60
S:1280x1024p-60
V:1024x768p-60
V:800x600p-60
V:640x480p-60
D:1280x720p-60
D:1920x1080p-60
I would like to know what the first character of each line means (S, U, V or D).
Is there a standard/documentation listing all the possible characters?
From the Linux kernel source, function mode_string():
char m = 'U';
if (mode->flag & FB_MODE_IS_DETAILED)
m = 'D';
if (mode->flag & FB_MODE_IS_VESA)
m = 'V';
if (mode->flag & FB_MODE_IS_STANDARD)
m = 'S';
So it's U for unknown, D for detailed, V for VESA, and S for standard.
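If the list needs to be interpreted programmatically, the mapping above is small enough to encode directly. A minimal sketch follows; `mode_kind` is a hypothetical helper, and the input lines are taken from the listing in the question.

```rust
/// Map the leading character of a line from /sys/class/graphics/fb0/modes
/// to the meaning given by the kernel's mode_string().
fn mode_kind(line: &str) -> Option<&'static str> {
    match line.chars().next()? {
        'U' => Some("unknown"),
        'D' => Some("detailed"),
        'V' => Some("VESA"),
        'S' => Some("standard"),
        _ => None,
    }
}

fn main() {
    let lines = ["U:720x576p-50", "D:1920x1080p-50", "S:1280x1024p-60", "V:1024x768p-60"];
    for line in lines.iter() {
        println!("{} -> {}", line, mode_kind(line).unwrap_or("?"));
    }
}
```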

Any programming language with "strange" function call?

I was wondering, is there any programming language where you can have function calls like this:
function_name(parameter1)function_name_continued(parameter2);
or
function_name(param1)function_continued(param2)...function_continued(paramN);
For example you could have this function call:
int dist = distanceFrom(cityA)to(cityB);
if you have defined distanceFromto function like this:
int distanceFrom(city A)to(city B)
{
// find distance between city A and city B
// ...
return distance;
}
As far as I know, in C, Java and SML programming languages, this cannot be done.
Are you aware of any programming language that lets you define and call functions in this way?
It looks an awful lot like Objective-C
- (int)distanceFrom:(City *)cityA to:(City *)cityB {
// woah!
}
Sounds a lot like Smalltalk's syntax, (which would explain Objective-C's syntax - see kubi's answer).
Example:
dist := metric distanceFrom: cityA to: cityB
where #distanceFrom:to: is a method on some object called metric.
So you have "function calls" (they're really message sends) like
'hello world' indexOf: $o startingAt: 6. "$o means 'the character literal o'"
EDIT: I'd said "Really, #distanceFrom:to: should be called #distanceTo: on a City class, but anyway." Justice points out that this couples a City to a Metric, which is Bad. There are good reasons why you might want to vary the metric - aeroplanes might use a geodesic while cars might use a shortest path based on the road network.
For the curious, Agda2 has a similar, very permissive syntax. The following is valid code:
data City : Set where
London : City
Paris : City
data Distance : Set where
_km : ℕ → Distance
from_to_ : City → City → Distance
from London to London = 0 km
from London to Paris = 342 km
from Paris to London = 342 km
from Paris to Paris = 0 km
If
from Paris to London
is evaluated, the result is
342 km
Looks a lot like a fluent interface or method chaining to me.
In Python, you can explicitly pass the name of the arguments you're calling the function with, which lets you pass them in a different order or skip optional arguments:
>>> l = [3,5,1,2,4]
>>> print l.sort.__doc__
L.sort(cmp=None, key=None, reverse=False) -- stable sort *IN PLACE*;
cmp(x, y) -> -1, 0, 1
>>> l.sort (reverse=True)
>>> l
[5, 4, 3, 2, 1]
This looks a lot like what the Objective C syntax is doing, tagging each argument to a function with its name.
C# 4.0's Named and Optional Arguments feature allows you to achieve something pretty similar:
public static int Distance(string from, string to, string via = "")
{
...
}
public static void Main()
{
int distance;
distance = Distance(from: "New York", to: "Tokyo");
distance = Distance(to: "Tokyo", from: "New York");
distance = Distance(from: "New York", via: "Athens", to: "Tokyo");
}
(see my very favourite personal effort - the final C++ approach at the end of this answer)
Language One
Objective-C but the calling syntax is [object message] so would look like:
int dist = [cities distanceFrom:cityA to:cityB];
if you have defined distanceFromto function like this, within a cities object:
- (int)distanceFrom:(City *)cityA to:(City *)cityB
{
// find distance between city A and city B
// ...
return distance;
}
Language Two
I also suspect you could achieve something very close to this in the IO Language but I'm only just looking at it. You may also want to read about it in comparison to other languages in Seven Languages in Seven Weeks which has a free excerpt about IO.
Language Three
There's an idiom ("chaining") in C++, described in The Design and Evolution of C++ as a replacement for keyword arguments, where you return a temporary object or the current object. It looks like this:
int dist = distanceFrom(cityA).to(cityB);
if you have defined distanceFrom function like this, with a little helper object. Note that inline functions make this kind of thing compile to very efficient code.
class DistanceCalculator
{
public:
DistanceCalculator(City* from) : fromCity(from) {}
int to(City * toCity)
{
// find distance between fromCity and toCity
// ...
return distance;
}
private:
City* fromCity;
};
inline DistanceCalculator distanceFrom(City* from)
{
return DistanceCalculator(from);
}
Duhh, I was in a hurry earlier, realised I can refactor to just use a temporary object to give the same syntax:
class distanceFrom
{
public:
distanceFrom(City* from) : fromCity(from) {}
int to(City * toCity)
{
// find distance between fromCity and toCity
// ...
return distance;
}
private:
City* fromCity;
};
MY FAVOURITE
and here's an even more inspired C++ version that allows you to write
int dist = distanceFrom cityA to cityB;
or even
int dist = distanceFrom cityA to cityB to cityC;
based on a wonderfully C++ ish combination of #define and classes:
#include <vector>
#include <numeric>
class City;
#define distanceFrom DistanceCalculator() <<
#define to <<
class DistanceCalculator
{
public:
operator int()
{
// find distance between chain of cities
return std::accumulate(cities.begin(), cities.end(), 0);
}
DistanceCalculator& operator<<(City* aCity)
{
cities.push_back(aCity);
return *this;
}
private:
std::vector<City*> cities;
};
NOTE this may look like a useless exercise but in some contexts it can be very useful to give people a domain-specific language in C++ which they compile alongside libraries. We used a similar approach with Python for geo-modeling scientists at the CSIRO.
You can do this in C, albeit unsafely:
struct Arg_s
{
int from;
int to;
};
int distance_f(struct Arg_s args)
{
return args.to - args.from;
}
#define distance(...) distance_f( ((struct Arg_s){__VA_ARGS__}) )
#define from_ .from =
#define to_ .to =
This uses compound literals and designated initializers.
printf("5 to 7 = %i\n",distance(from_ 5, to_ 7));
// 5 to 7 = 2
3 of the 4 confederated languages from RemObjects in their Elements Compiler have this capability in precisely the OP's requested syntax (to support Objective-C runtime, but made available to all operating systems).
in Hydrogene (an extended C#)
https://docs.elementscompiler.com/Hydrogene/LanguageExtensions/MultiPartMethodNames
in Iodine (an extended Java)
https://docs.elementscompiler.com/Iodine/LanguageExtensions/MultiPartMethodNames
in Oxygene (an extended ObjectPascal), scroll down to Multi-Part Method Names section
https://docs.elementscompiler.com/Oxygene/Members/Methods
This looks similar to function overloading (C++/C#)/default parameters (VB).
Default Parameters allow the person defining the function to set defaults for the latter parameters:
e.g. C# overloading:
int CalculateDistance(city A, city B, city via1, city via2)
{....}
int CalculateDistance(city A, city B)
{
return CalculateDistance(A, B, null, null);
}
You can use a member function for this.
cityA.distance_to(cityB);
That's valid code in C++, C (with a little tweaking), C#, Java. Using method chains, you can do:
cityA.something(cityB).something(cityC).something(cityD).something(cityE);
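The chaining approach carries over to other languages with methods on values. Here is a minimal Rust sketch of the `distanceFrom(cityA).to(cityB)` shape; the `City` fields and the use of straight-line (Euclidean) distance are assumptions made only so that the example produces a number.

```rust
/// A city with made-up planar coordinates (assumption for this sketch).
#[derive(Debug, Clone, Copy)]
struct City {
    x: f64,
    y: f64,
}

/// Intermediate value that remembers the starting city between the two calls.
struct DistanceFrom {
    origin: City,
}

/// First half of the "split" function name.
fn distance_from(origin: City) -> DistanceFrom {
    DistanceFrom { origin }
}

impl DistanceFrom {
    /// Second half: consumes the intermediate value and produces the result.
    fn to(self, dest: City) -> f64 {
        let dx = dest.x - self.origin.x;
        let dy = dest.y - self.origin.y;
        (dx * dx + dy * dy).sqrt()
    }
}

fn main() {
    let city_a = City { x: 0.0, y: 0.0 };
    let city_b = City { x: 3.0, y: 4.0 };
    let dist = distance_from(city_a).to(city_b);
    println!("{}", dist); // prints "5"
}
```

The intermediate `DistanceFrom` value plays the role of the helper object in the C++ answer: the first call captures one argument and the second call supplies the other.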
In SML you could simply make "to" some value (unit, for example), and "distanceFrom" a curried function that takes three parameters. For example:
val to = ()
fun distanceFrom x _ y = (* implementation function body *)
val foo = distanceFrom cityA to cityB
You could also take advantage of the fact that SML doesn't enforce naming conventions on datatype constructors (much to many people's annoyance), so if you want to make sure that the type system enforces your custom syntax:
datatype comp = to
fun distanceFrom x to y = (* implementation *)
val foo = distanceFrom cityA to cityB (* works *)
val foo' = distanceFrom cityA cityB (* whoops, forgot 'to' - type error! *)
You could do this in Scheme or LISP using macros.
The form will be something like:
(DISTANCE-FROM city-a TO city-b)
The symbols in uppercase denote syntax.
You could even do something like 'named parameters':
(DISTANCE TO city-a FROM city-b)
(DISTANCE FROM city-a TO city-b)
Tcl allows you to do something like this:
proc distance {from cityA to cityB} {...}
set distance [distance from "Chicago IL" to "Tulsa OK"]
I'm not sure if that's quite what you are thinking of though.
You can do it in Java using the Builder pattern that appears in the book Effective Java by Joshua Bloch (this is the second time I've posted this link on SO; I still haven't used the pattern, but it looks great).
Well, in Felix you can implement this in two steps: first, you write an ordinary function. Then, you can extend the grammar and map some of the new non-terminals to the function.
This is a bit heavyweight compared to what you might want (welcome to help make it easier!!) I think this does what you want and a whole lot more!
I will give a real example because the whole of the Felix language is actually defined by this technique (below x is the non-terminal for expressions, the p in x[p] is a precedence code):
// alternate conditional
x[sdollar_apply_pri] := x[stuple_pri] "unless" x[let_pri]
"then" x[sdollar_apply_pri] =>#
"`(ast_cond ,_sr ((ast_apply ,_sr (lnot ,_3)) ,_1 ,_5))";
Here's a bit more:
// indexes and slices
x[sfactor_pri] := x[sfactor_pri] "." "[" sexpr "]" =>#
"`(ast_apply ,_sr (,(noi 'subscript) (,_1 ,_4)))";
x[sfactor_pri] := x[sfactor_pri] "." "[" sexpr "to" sexpr "]" =>#
"`(ast_apply ,_sr (,(noi 'substring) (,_1 ,_4 ,_6)))";
x[sfactor_pri] := x[sfactor_pri] "." "[" sexpr "to" "]" =>#
"`(ast_apply ,_sr (,(noi 'copyfrom) (,_1 ,_4)))";
x[sfactor_pri] := x[sfactor_pri] "." "[" "to" sexpr "]" =>#
"`(ast_apply ,_sr (,(noi 'copyto) (,_1 ,_5)))";
The Felix grammar is ordinary user code. In the examples the grammar actions are written in Scheme. The grammar is GLR. It allows "context sensitive keywords", that is, identifiers that are keywords in certain contexts only, which makes it easy to invent new constructs without worrying about breaking existing code.
Perhaps you would like to examine Felix Grammar Online.
