How to solve a linear system in the Linux shell?

Does anyone know of a Linux command that reads a linear system of equations from its standard input and writes the solution (if one exists) to its standard output?
I want to do something like this:
generate_system | solve_system

You can probably write your own such command using this package.

This is an old question, but showed up in my searches for this problem, so I'm adding an answer here.
I used Maxima's solve function. Wrangling the input/output to/from Maxima is a bit of a challenge, but it can be done.
Prepare the system of equations as a comma-separated list -- for example, EQs="C[1]+C[2]=1,C[1]-C[2]=2". I wanted a solution for an unknown number of variables, so I used C[n], but you can use plain variable names.
Prepare a list of the variables you wish to solve for -- EQ_VARS="C[1],C[2]"
Maxima will echo all inputs, wrap long lines, and return the solution in the form [C[1]=...,C[2]=...]. We need to work around all of these.
Taken together, this becomes
OUT_VALS=( \
  $(maxima --very-quiet \
      --batch-string="display2d:false\$linel:9999\$print(map(rhs,float(solve([$EQs],[$EQ_VARS]))[1]))\$" \
    | tail -n 1 \
    | tr -c '0-9-.e' ' ') )
which will place the solution values into the array $OUT_VALS.
Note that this only handles the Maxima output properly if your problem is correctly constrained -- if there are zero solutions, or more than one, the output will not be parsed correctly.
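For example, with the system and variable list from above, a complete run looks like this (the expected values are worked out by hand for this 2x2 system):

EQs="C[1]+C[2]=1,C[1]-C[2]=2"
EQ_VARS="C[1],C[2]"
# ...run the OUT_VALS pipeline above...
echo "${OUT_VALS[0]}"   # 1.5  (C[1])
echo "${OUT_VALS[1]}"   # -0.5 (C[2])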

Related

Why is the character x used to mean extract in linux commands?
For example, in the command tar xf helloworld, the x means extract.
Why is x used to mean extract instead of k or u?
There isn't always a clear reason why a particular letter is chosen for a particular option, especially as there are only 26 letters in the alphabet and sometimes more than 26 options. In such cases, long options tend to be used, such as --extract, or a seemingly arbitrary letter. An example in the tar command you mentioned is -h for --dereference - this is simply what the author chose to use.
In this case, I believe that the -x is from eXtract. If you say "x" out loud, it sounds like "Ex" - the first syllable of the word.
-k is used as a shortcut for --keep-old-files and -u is used for --update, so they can't be used for --extract.
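For reference, GNU tar accepts both spellings, so the following two commands are equivalent:

tar -xf helloworld                # short option: eXtract
tar --extract --file=helloworld   # long-option equivalent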

In a bash function, how do I get stdin into a variable

I want to call a function with a pipe, and read all of stdin into a variable.
I read that the correct way to do that is with read, or maybe read -r or read -a. However, I had a lot of problems doing that in practice (especially with multi-line strings).
In the end I settled on
function example () {
    local input=$(cat)
    ...
}
What is the idiomatic way to do this?
input=$(cat) is a perfectly fine way to capture standard input if you really need to. One caveat is that command substitutions strip all trailing newlines, so if you want to make sure to capture those as well, you need to ensure that something aside from the newline(s) is read last.
input=$(cat; echo x)
input=${input%x} # Strip the trailing x
Another option in bash 4 or later is to use the readarray command, which will populate an array with each line of standard input, one line per element, which you can then join back into a single variable if desired.
readarray foo                    # one array element per line of stdin
printf -v foo "%s" "${foo[@]}"   # join the lines back into a single string
I've found that using cat is really slow in comparison to the following method, based on tests I've run:
local input="$(< /dev/stdin)"
In case anyone is wondering, < is just input redirection. From the bash-hackers wiki:
When the inner command is only an input redirection, and nothing else,
for example
$( <FILE )
# or
` <FILE `
then Bash attempts to read the given file and act just as if the given
command was cat FILE.
Remarks about portability
In terms of how portable this method is: you are likely to go your entire Linux career without ever using a system that doesn't have /dev/stdin, but in case you want to satisfy that itch, here is a question on Unix Stack Exchange about the portability of directly accessing /dev/{stdin,stdout,stderr} and friends.
One more thing I've come across when working with Linux containers, such as ones built with docker or buildah, is that there are situations where /dev/stdin or even /dev/stdout is not available inside the container. I've not been able to conclusively say what causes this.
There are a few overlapping / very similar questions floating around on SO. I answered this here, using the read built-in:
https://stackoverflow.com/a/58452863/3220983
In my answers there, however, I am ONLY concerned with a single line.
The arguable weakness of the cat approach is that it requires spawning a subshell. Otherwise, it's a good one. It's probably the easiest way to deal with multi-line processing, as specifically asked about here.
I think the read approach is faster / more resource efficient if you are trying to chain a lot of commands, or iterate through a list calling a function repeatedly.
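For multi-line input, here is a minimal sketch of that read-based approach (plain bash; the loop body is a placeholder for your real per-line work):

example () {
    local line
    while IFS= read -r line; do   # raw read, no subshell spawned
        printf 'got: %s\n' "$line"
    done
}
printf 'a\nb\n' | example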

The difference between arguments and options pertaining to the linux shell

I'm currently enrolled in an intro to Unix / Linux class and we came to a question that the instructor and I did not agree on.
cp -i file1 file2
Which is true about the preceding command?
A. There is only one utility
B. There is one option
C. There are three arguments
D. file1 will be copied as file2 and the user will be warned before an overwrite occurs
E. All of the above
I insisted that it was E. All of the above. The instructor has settled on D.
It seems clear that A, B, and D are all correct. The hang-up was C, and whether or not the -i flag is both an option and an argument.
My logic was that all options are arguments, but not all arguments are options; and since there are multiple true answers listed, multiple-choice tradition says the answer is more than likely E, all of the above.
I haven't been able to find the smoking gun on this issue and thought I would throw it to the masters.
I know this is an old thread, but I want to add the following for anyone else that may stumble into a similar disagreement.
$ ls -l junk
-rw-r--r-- 1 you 19 Sep 26 16:25 junk
"The strings that follow the program name on the command line, such as -l
and junk in the example above, are called the program's arguments. Arguments are usually options or names of files to be used by the command."
Brian W. Kernighan & Rob Pike, "The UNIX Programming Environment"
The manual page here states:
Mandatory arguments to long options are mandatory for short options too.
This seems to imply that in the context of this particular question, at least, you're supposed to not consider options to be arguments. Otherwise it becomes very recursive and kind of pointless.
I think the instructor should accept your explanation, though; this really is splitting hairs for most typical cases.
I think the term "arguments" is used in different ways in different contexts, which seems to be the root of the disagreement. In support of your stance, though, note that the C runtime, in which cp was most likely written, declares the program entry point as main(argc, argv) (types elided). This seems to indicate that those who designed the C architecture/library/etc. thought of options as a subset of arguments. Then of course options can have their own arguments (a different context), and so on....
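A quick shell illustration of the same point (show_args is a made-up helper): an option like -i reaches the program as just another entry in its argument list.

show_args () {
    local i=0 arg
    for arg in "$@"; do
        printf 'argv[%d] = %s\n' "$((++i))" "$arg"
    done
}
show_args -i file1 file2   # argv[1] = -i, argv[2] = file1, argv[3] = file2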
This is how I was taught; in this case:
cp -i file1 file2
The right answer would be A, B, and D, but not C.
-i is an option, while file1 and file2 are arguments. Normally options are considered to change the behaviour of an application or command, whereas arguments do not.
I suppose it comes down to semantics whether you consider -i an argument of the original application, since it is a behaviour-changing option (or argument) of cp, but in plain English it is called an option, not an argument.
That's how I still define and keep the difference between the two parts of a command.
As another command example, consider cron jobs. I often use PHP cron jobs, and I normally have both options and arguments associated with the command. Options are always used (in my opinion) to define extra behaviour, while arguments provide the app and its behaviours with the data required to complete the operation.
Edit
I agree with @unwind that this is splitting hairs, and a lot of the time it actually comes down to scenario and opinion. It was quite bad of him to even mark on it, really; he should have known this is a subjective question. Tests are completely unfair when filled with subjective questions.
Hmmm... I personally like to distinguish between options and arguments; however, you could technically say that options are arguments. I would say that you are correct, but I think your instructor settled on D because he doesn't want you to get them confused. For example, the following are equivalent to the above command...:
ARG1="-i" ; ARG2="file1" ; ARG3="file2" ; cp $ARG1 $ARG2 $ARG3
ARG1="-i" ; ARG2="file1" ; ARG3="file2" ; cp $ARG2 $ARG1 $ARG3
ARG1="-i" ; ARG2="file1" ; ARG3="file2" ; cp $ARG2 $ARG3 $ARG1
...whereas cp $ARG1 $ARG3 $ARG2 is not the same. I would say that options are a special type of arguments.
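As a sketch of how that special treatment looks in practice, bash's getopts builtin peels options off the front of the argument list, leaving the remaining arguments (the operands) behind:

interactive=0
while getopts "i" opt; do    # consume leading options such as -i
    case $opt in
        i) interactive=1 ;;
    esac
done
shift $((OPTIND - 1))        # discard the parsed options
echo "operands: $*"          # for -i file1 file2, prints: operands: file1 file2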

Learning/Detecting Mutatable Parts of a URL in Logs

Say you have a webserver log (apache, nginx, whatever). From it you extract a large list of URLs:
/article/1/view
/article/2/view
/article/1/view
/article/1323/view
/article/1/edit
/help
/article/1/view
/contact
/contact/thank-you
/article/8/edit
...
or
/blog/2012/06/01/how-i-will-spend-my-summer-vacation
/blog/2012/08/30/how-i-wasted-my-summer-vacation
...
You explode these URLs into their pieces such that you have ['article', '1323', 'view'] or ['blog', '2012', '08', '30', 'how-i-wasted-my-summer-vacation'].
How would one go about analyzing and comparing these URLs to detect and call out "variables" in the URL path? That is to say, you would want to recognize things like /article/XXX/view, /article/XXX/edit, and /blog/XXX/XXX/XXX/XXX so that you can summarize information about those lines in the logs.
I assume that there will need to be some statistical threshold for the number of differences that constitute a mutable piece vs a similar looking but different template. I am also unsure as to what data structure would make this analysis quick and easy.
I would like the script to output what it thinks are all the URL templates present on the server, possibly with a confidence value if appropriate.
A simple solution would be to count path occurrences and learn which values correspond to templates. Assume that the file input contains the URLs from your first snippet. Then compute the per-path visits:
awk -F '/' '{ for (i=2; i<=NF; ++i) { for (j=2; j<=i; ++j) printf "/%s", $j; printf "\n" }}' input \
  | sort \
  | uniq -c \
  | sort -rn
This yields:
7 /article
4 /article/1
3 /article/1/view
2 /contact
1 /help
1 /contact/thank-you
1 /article/8/edit
1 /article/8
1 /article/2/view
1 /article/2
1 /article/1323/view
1 /article/1323
1 /article/1/edit
Now you have a weight for each path which you can feed into a score function f(x, y), where x represents the count and y the depth of the path. For example, the first line would result in the invocation f(7, 2) and may return a value in [0,1], say 0.8, to tell you that the given parametrization corresponds to a template with 80% confidence. Of course, all the magic happens in f, and you would have to come up with reasonable values based on the paths that you see being accessed. To develop a good f, you could use logistic regression on a small data set and see how well it predicts the binary feature of being a template.
You can also take a mundane route: just drop the tail, e.g., all values <= 1.
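For example, appending one more stage to the pipeline above keeps only the paths seen more than once:

awk -F '/' '{ for (i=2; i<=NF; ++i) { for (j=2; j<=i; ++j) printf "/%s", $j; printf "\n" }}' input \
  | sort \
  | uniq -c \
  | sort -rn \
  | awk '$1 > 1'   # drop the tail: counts <= 1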
How about using a DAWG? Except the nodes would store not letters, but the URI pieces.
This is a very nice data structure: it has pretty minimal memory requirements, it's easy to traverse, and, being a DAG, there are plenty of easy and well-researched algorithms for it. It also happens to describe the state machine that accepts all URLs in the sample and rejects all others (so we might actually build a regular expression out of it, which is very neat, but I'm not clever enough to know how to go about it from there).
Anyhow, with a structure like this, your problem translates into that of finding the "bottlenecks". I'd guess there are proper algorithms for that, but with a large enough sample where variables vary wildly enough, it's basically this: the more nodes there are at a certain depth, the more likely it's a mutable part.
A probably naive approach would be this: keeping separate DAWGs for every starting part, I'd find the mean width of the DAWG (possibly weighted by depth). And if a level's width is above that mean, I'd consider it a variable, with the probability depending on how far it is from the mean. You may very well unleash the power of statistics at this point, modeling the distribution of the width.
This approach wouldn't fare well with independent patterns starting with the same part, like "shop/?/?" and "shop/admin/?/edit". This could perhaps be mitigated by examining the DAWGs in a more dynamic fashion, using a sliding window of sorts to examine only a part of the DAWG at once, but I don't know how. Oh, and the whole thing fails horribly if the very first part is a variable, but thankfully that's rare.
You may also look out for little things like all nodes of the same level holding numerical values (more likely to be a variable), and I'd certainly check for common date patterns in the sample before building the DAWGs; factoring them out would make handling the blog-like patterns easier.
(Oh and, adding the "algorithm" tag would probably attract more attention to the question.)

How can I loop through variables in SPSS? I want to avoid code duplication

Is there a "native" SPSS way to loop through some variable names? All I want to do is take a list of variables (that I define) and run the same procedure for them:
pseudo-code - not really a good example, but gets the point across...
for i in varlist['a','b','c']
do
FREQUENCIES VARIABLES=varlist[i] / ORDER=ANALYSIS.
end
I've noticed that people seem to just use R or Python SPSS plugins to achieve this basic array functionality, but I don't know how soon I can get those configured (if ever) on my installation of SPSS.
SPSS has to have some native way to do this...right?
There are two easy solutions for looping through variables (easier than using Python in SPSS).
1) DO REPEAT-END REPEAT
The drawback is that DO REPEAT-END REPEAT is mainly limited to data transformations - for example COMPUTE, RECODE, etc.; procedures such as FREQUENCIES are not allowed. For example:
DO REPEAT R=REGION1 TO REGION5.
COMPUTE R=0.
END REPEAT.
2) DEFINE-!ENDDEFINE (macro facility)
You can do Frequencies in a loop of variables using macro command. For example:
DEFINE macdef (!POS !CHAREND('/'))
!DO !i !IN (!1)
frequencies variables = !i.
!DOEND
!ENDDEFINE.
macdef VAR1 VAR2 VAR3 /.
If I understand the question correctly, there may be no need to use a looping construct. SPSS commands with a VARIABLES subcommand like FREQUENCIES allow you to specify multiple variables.
The basic syntax for the FREQUENCIES command is:
FREQUENCIES
VARIABLES= varlist [varlist...]
where [varlist] is a single variable name, multiple space-delimited variable names, a range of consecutive variables specified with the TO keyword, the keyword ALL, or a combination of the previous options.
For example:
FREQUENCIES VARIABLES=VARA.
FREQUENCIES VARIABLES=VARA VARB VARC.
FREQUENCIES VARIABLES=VARA TO VARC.
FREQ VAR=ALL.
FREQ VAR=VARA TO VARC VARM VARX TO VARZ.
See SPSS Statistics 17.0 Command Syntax Reference available at http://support.spss.com/ProductsExt/SPSS/Documentation/SPSSforWindows/index.htm
Note that it's been years since I've actually used SPSS.
It's more efficient to do all these frequencies on one data pass, e.g.,
FREQUENCIES a to c.
but Python lets you do looping and lots of other control flow tricks.
begin program.
import spss
for v in ['a', 'b', 'c']:
    # the loop body must be indented; terminate each submitted command with a period
    spss.Submit("FREQUENCIES " + v + ".")
end program.
Using Python requires installing the (free) Python plugin available from SPSS Developer Central, www.spss.com/devcentral.
You can, of course, use macros for this sort of thing, but Python is a lot more powerful and easier once you get the hang of it.
Yes, SPSS can do this. Sounds like the guys at UCLA use Python 'cause they know how to do it in Python and not in SPSS. :)
Let's call your variables VARA, VARB, VARC. They must be numerical (since you are doing frequencies) and they must be consecutive in your SPSS data file. Then you create a vector saying, in effect, "here is the series of variables I want to loop through".
VECTOR VectorVar = VarA TO VarC.
LOOP #cnt = 1 TO 3 BY 1.
FREQUENCIES VARIABLES=VectorVar(#cnt) /ORDER=ANALYSIS.
END LOOP.
EXECUTE.
(The above has not been tested. Might be missing a period somewhere, etc.)
Here's a page from UCLA's Academic Technology Services that describes looping over lists of variables. Quote,
"Because we are looping through more
than one variable, we will need to use
Python."
In my experience, UCLA ATS is probably the site with the best coverage of all of the major statistical computing systems. If they say you need Python... you probably need Python.
Er... sorry for being that guy, but maybe it's time to switch to a different stats system.
I haven't used SPSS macros very much, but maybe they can get you where you need to be? Check out this site for some examples:
http://spsstools.net/Macros.htm
Also, the SPSS Data Management book may be helpful as well.
Lastly, if memory serves, I think this very problem may even be the main example of how to leverage Python inside of SPSS syntax. I have only used Python and SPSS a few times, but it is very handy to have that language accessible if need be.
HTH
How can I do this Stata syntax in SPSS?
foreach var of varlist pob_multi pob_multimod pob_multiex vul_car vul_ing nopob_nov espacio carencias carencias_3 ic_rezedu ic_asalud ic_ss ic_cv ic_sbv ic_ali pobex pob {
tabstat `var' [w=factor] if pob_multi!=., stats(mean) save
matrix define `var'_pp =(r(StatTotal))
matrix rownames `var'_pp = `var'_pp
}
matrix tabla1 = (pob_multi_pp \ pob_multimod_pp \ pob_multiex_pp \ vul_car_pp \ vul_ing_pp \ nopob_nov_pp \ espacio_pp \ carencias_pp \ carencias_3_pp \ espacio_pp \ ic_rezedu_pp\ ic_asalud_pp \ ic_ss_pp \ ic_cv_pp \ ic_sbv_pp\ ic_ali_pp \ espacio_pp \ pobex_pp \ pob_pp )
matrix list tabla1
thanks.
