SIMD code works in Debug, but does not in Release - rust

This code works in debug mode, but panics because of the assert in release mode.
use std::arch::x86_64::*;

fn main() {
    unsafe {
        let a = vec![2.0f32, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0];
        let b = -1.0f32;

        let ar = _mm256_loadu_ps(a.as_ptr());
        println!("ar: {:?}", ar);
        let br = _mm256_set1_ps(b);
        println!("br: {:?}", br);
        let mut abr = _mm256_setzero_ps();
        println!("abr: {:?}", abr);
        abr = _mm256_fmadd_ps(ar, br, abr);
        println!("abr: {:?}", abr);
        let mut ab = [0.0; 8];
        _mm256_storeu_ps(ab.as_mut_ptr(), abr);
        println!("ab: {:?}", ab);

        assert_eq!(ab[0], -2.0f32);
    }
}
(Playground)

I can indeed confirm that this code causes the assert to trip in release mode:
$ cargo run --release
Finished release [optimized] target(s) in 0.00s
Running `target/release/so53831502`
ar: __m256(2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
br: __m256(-1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0)
abr: __m256(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
abr: __m256(-1.0, -1.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0)
ab: [-1.0, -1.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0]
thread 'main' panicked at 'assertion failed: `(left == right)`
left: `-1.0`,
right: `-2.0`', src/main.rs:24:9
This appears to be a compiler bug, see here and here. In particular, you are calling routines like _mm256_set1_ps and _mm256_fmadd_ps, which require the CPU features avx and fma respectively, but neither your code nor your compilation command indicates to the compiler that such features should be used.
One way of fixing this is to tell the compiler to compile the entire program with both the avx and fma features enabled, like so:
$ RUSTFLAGS="-C target-feature=+avx,+fma" cargo run --release
Compiling so53831502 v0.1.0 (/tmp/so53831502)
Finished release [optimized] target(s) in 0.36s
Running `target/release/so53831502`
ar: __m256(2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
br: __m256(-1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0)
abr: __m256(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
abr: __m256(-2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
ab: [-2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
Another approach that achieves the same result is to tell the compiler to use all of the CPU features available on your machine:
$ RUSTFLAGS="-C target-cpu=native" cargo run --release
Compiling so53831502 v0.1.0 (/tmp/so53831502)
Finished release [optimized] target(s) in 0.34s
Running `target/release/so53831502`
ar: __m256(2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
br: __m256(-1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0)
abr: __m256(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
abr: __m256(-2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
ab: [-2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
However, both of these compilation commands produce binaries that can only run on CPUs that support the avx and fma features. If that's not a problem for you, then this is a fine solution. If you would instead like to build portable binaries, then you can perform CPU feature detection at runtime, and compile certain functions with specific CPU features enabled. It is then your responsibility to guarantee that said functions are only invoked when the corresponding CPU feature is enabled and available. This process is documented as part of the dynamic CPU feature detection section of the std::arch docs.
Here's an example that uses runtime CPU feature detection:
use std::arch::x86_64::*;
use std::process;

fn main() {
    if is_x86_feature_detected!("avx") && is_x86_feature_detected!("fma") {
        // SAFETY: This is safe because we're guaranteed to support the
        // necessary CPU features.
        unsafe { doit(); }
    } else {
        eprintln!("unsupported CPU");
        process::exit(1);
    }
}

#[target_feature(enable = "avx,fma")]
unsafe fn doit() {
    let a = vec![2.0f32, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0];
    let b = -1.0f32;

    let ar = _mm256_loadu_ps(a.as_ptr());
    println!("ar: {:?}", ar);
    let br = _mm256_set1_ps(b);
    println!("br: {:?}", br);
    let mut abr = _mm256_setzero_ps();
    println!("abr: {:?}", abr);
    abr = _mm256_fmadd_ps(ar, br, abr);
    println!("abr: {:?}", abr);
    let mut ab = [0.0; 8];
    _mm256_storeu_ps(ab.as_mut_ptr(), abr);
    println!("ab: {:?}", ab);

    assert_eq!(ab[0], -2.0f32);
}
To run it, you no longer need to set any compilation flags:
$ cargo run --release
Compiling so53831502 v0.1.0 (/tmp/so53831502)
Finished release [optimized] target(s) in 0.29s
Running `target/release/so53831502`
ar: __m256(2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
br: __m256(-1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0)
abr: __m256(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
abr: __m256(-2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
ab: [-2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
If you run the resulting binary on a CPU that doesn't support either avx or fma, then the program should exit with an error message: unsupported CPU.
In general, I think the docs for std::arch could be improved. In particular, the key boundary at which you need to split your code is dependent upon whether your vector types appear in your function signature. That is, the doit routine does not require anything beyond the standard x86 (or x86_64) function ABI to call, and is thus safe to call from functions that don't otherwise support avx or fma. However, internally, the function has been told to compile its code using additional instruction set extensions based on the given CPU features. This is achieved via the target_feature attribute. If you, for example, supplied an incorrect target feature:
#[target_feature(enable = "ssse3")]
unsafe fn doit() {
    // ...
}
then the program exhibits the same behavior as your initial program.

Related

Sympy deadlock / infinite hanging when trying to invert a 24x24 matrix

I have the following 24x24 square matrix, H, that has a non-zero determinant:
Matrix([
[ 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -R5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[ 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -R4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[ 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -R11, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[ 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -R9, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[ 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -R7, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[ 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -R10, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -R8, 0.0, 0.0, 0.0, 0.0, 0.0],
[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -R6, 0.0, 0.0, 0.0, 0.0],
[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -R3, 0.0, 0.0, 0.0],
[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -R2, 0.0, 0.0],
[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -R1, 0.0],
[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -R0],
[-1.0, 0.0, 1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[ 0.0, -1.0, 0.0, 1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[-1.0, 0.0, 1.0, -1.0, 0.0, 1.0, -1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[ 0.0, -1.0, 0.0, 1.0, -1.0, 0.0, 1.0, -1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0],
[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0, 0.0, -1.0, 0.0],
[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 1.0, -1.0],
[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, -1.0, 0.0],
[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, -1.0],
[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0]])
Whenever I try to compute H.inv() to obtain the inverse of the matrix, Python just hangs indefinitely. It does not crash, nor return an error.
Using sympy.inverse(H) returns H**-1 (as in, it returns the full uninverted matrix and adds **-1 on the end), but if I then try to multiply this matrix by another matrix, e.g. a column vector, I get the same infinite hanging again.
What is the cause of this? Can sympy just not handle matrices this large, or is there something wrong with the original matrix H?
Computing the inverse of a matrix like this will get much faster in SymPy soon. I have an implementation that can invert this in 24 seconds, but that can also be improved. As suggested in the comments, the new implementation is generally faster with integer/rational entries than with floats.
It takes a long time partly because SymPy can be made more efficient, but also because the answer is complicated even though the matrix itself looks simple. In fact I just tried to post the answer here but SO said
Body is limited to 30000 chars; you entered 2691249
so it is too complicated to show here.
Instead I'll just show the upper-left entry of the inverse matrix:
In [23]: ok[0, 0]
Out[23]:
[output omitted: a single large fraction whose numerator and denominator are each long sums of products of the R symbols; the pretty-printed expression wraps across many lines and is not reproduced here]
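One way to follow the integer/rational suggestion from the comments is to convert the float entries of H to exact rationals before inverting. This is a sketch of that idea, not code from the original answer, and the result is still a very large expression:

import sympy as sp

# H is the 24x24 matrix from the question
H_exact = H.applyfunc(sp.nsimplify)  # turn the 1.0/0.0 float entries into exact rationals
H_inv = H_exact.inv(method='LU')     # exact inverse via LU decomposition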

How to write a custom loss function in LGBM?

I have a binary cross-entropy implementation in Keras. I would like to implement the same one in LGBM as a custom loss. I understand that LGBM of course has a built-in 'binary' objective, but I would like to implement this one myself as a starting point for some future enhancements.
Here is the code,
def custom_binary_loss(y_true, y_pred):
    """
    Keras version of binary cross-entropy (works like charm!)
    """
    # https://github.com/tensorflow/tensorflow/blob/v2.3.1/tensorflow/python/keras/backend.py#L4826
    y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
    term_0 = (1 - y_true) * K.log(1 - y_pred + K.epsilon())  # Cancels out when target is 1
    term_1 = y_true * K.log(y_pred + K.epsilon())            # Cancels out when target is 0
    return -K.mean(term_0 + term_1, axis=1)

# --------------------

def custom_binary_loss_lgbm(y_pred, train_data):
    """
    LGBM version of binary cross-entropy
    """
    y_pred = 1.0 / (1.0 + np.exp(-y_pred))
    y_true = train_data.get_label()
    y_true = np.expand_dims(y_true, axis=1)
    y_pred = np.expand_dims(y_pred, axis=1)

    epsilon_ = 1e-7
    y_pred = np.clip(y_pred, epsilon_, 1 - epsilon_)
    term_0 = (1 - y_true) * np.log(1 - y_pred + epsilon_)  # Cancels out when target is 1
    term_1 = y_true * np.log(y_pred + epsilon_)            # Cancels out when target is 0

    grad = -np.mean(term_0 + term_1, axis=1)
    hess = np.ones(grad.shape)
    return grad, hess
But using the above, my LGBM model only predicts zeros. My dataset is balanced and everything else looks fine, so what's the error here?
params = {
    'objective': 'binary',
    'num_iterations': 100,
    'seed': 21
}

ds_train = lgb.Dataset(df_train[predictors], y, free_raw_data=False)
reg_lgbm = lgb.train(params=params, train_set=ds_train, fobj=custom_binary_loss_lgbm)
I also tried a different hessian, hess = (y_pred * (1. - y_pred)).flatten(). Although I don't know what the hessian really means, it didn't work either!
list(map(lambda x: 1.0 / (1.0 + np.exp(-x)), reg_lgbm.predict(df_train[predictors])))
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, .............]
Try setting the metric parameter to the string "None" in params, like this:
params = {
    'objective': 'binary',
    'metric': 'None',
    'num_iterations': 100,
    'seed': 21
}
Otherwise, according to the documentation, the algorithm will choose a default evaluation metric for the 'binary' objective.
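If you still want an evaluation metric reported while 'metric' is set to 'None', you can pass a custom one through feval. A minimal sketch (the function name and threshold here are illustrative, not from the original answer; params, ds_train and custom_binary_loss_lgbm are as defined above):

import numpy as np

def binary_error_eval(y_pred, train_data):
    # convert raw scores to probabilities, then report the misclassification rate
    y_true = train_data.get_label()
    prob = 1.0 / (1.0 + np.exp(-y_pred))
    return 'binary_error', np.mean((prob > 0.5) != y_true), False

reg_lgbm = lgb.train(params=params, train_set=ds_train,
                     fobj=custom_binary_loss_lgbm, feval=binary_error_eval)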

Can someone help fit this array in KMeans clustering?

When I try to fit it with KMeans clustering, it throws the error "ValueError: setting an array element with a sequence."
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=5)
kmeans.fit(df)
Array description:
0      [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
1      [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
10     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
100    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
101    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
Name: Vector, Length: 179, dtype: object
Your column has a list in it. It needs to be opened up into multiple columns before passing it to KMeans.
df = pd.read_json('/Users/roshansk/Downloads/NewsArticles.json')
#Extracting the vectors into columns
vectors = df.Vector.apply(pd.Series)
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=5)
kmeans.fit(vectors)
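An equivalent approach, assuming every list in the Vector column has the same length (this alternative is not from the original answer), is to stack the lists into a 2-D NumPy array before fitting:

import numpy as np

X = np.vstack(df['Vector'].to_list())  # shape: (n_rows, vector_length)
kmeans.fit(X)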

sympy solve() gives implicit/incorrect answer

I'm trying to solve an equation system with 16 equations and 16 unknowns using sympy, but it doesn't seem to solve it well.
I want to solve the system [K][d]=[f] where [K] is the coefficient matrix, [d] the unknowns and [f] the constants. I know some unknowns "d" and some constants "f", so I have the same number of equations and unknowns, but when I substitute these values into the equations and try to solve, the results for all "dx" include "dx8". I checked the matrix determinant and it is positive, so I should get a unique answer.
Here is the code:
import sympy as sp
import numpy as np
K = np.array([[560000000.0, 0.0, -480000000.0, 80000000.0, 0.0, 0.0, 0.0, 0.0,-80000000.0, 120000000.0, 0.0, -200000000.0, 0.0, 0.0, 0.0, 0.0],
[0.0, 393333333.3, 120000000.0, -180000000.0, 0.0, 0.0, 0.0, 0.0,80000000.0, -213333333.3, -200000000.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[-480000000.0, 120000000.0, 1120000000.0, -200000000.0,-480000000.0, 80000000.0, 0.0, 0.0, 0.0, 0.0, -160000000.0,200000000.0, 0.0, -200000000.0, 0.0, 0.0],
[80000000.0, -180000000.0, -200000000.0, 786666666.7, 120000000.0,-180000000.0, 0.0, 0.0, 0.0, 0.0, 200000000.0, -426666666.7,-200000000.0, 0.0, 0.0, 0.0],
[0.0, 0.0, -480000000.0, 120000000.0, 1120000000.0, -200000000.0,-480000000.0, 80000000.0, 0.0, 0.0, 0.0, 0.0, -160000000.0,200000000.0, 0.0, -200000000.0],
[0.0, 0.0, 80000000.0, -180000000.0, -200000000.0, 786666666.7,120000000.0, -180000000.0, 0.0, 0.0, 0.0, 0.0, 200000000.0,-426666666.7, -200000000.0, 0.0],
[0.0, 0.0, 0.0, 0.0, -480000000.0, 120000000.0, 560000000.0,-200000000.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -80000000.0, 80000000.0],
[0.0, 0.0, 0.0, 0.0, 80000000.0, -180000000.0, -200000000.0,393333333.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 120000000.0,-213333333.3],
[-80000000.0, 80000000.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 560000000.0,-200000000.0, -480000000.0, 120000000.0, 0.0, 0.0, 0.0, 0.0],
[120000000.0, -213333333.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,-200000000.0, 393333333.3, 80000000.0, -180000000.0, 0.0, 0.0, 0.0,0.0],
[0.0, -200000000.0, -160000000.0, 200000000.0, 0.0, 0.0, 0.0, 0.0,-480000000.0, 80000000.0, 1120000000.0, -200000000.0, -480000000.0,120000000.0, 0.0, 0.0],
[-200000000.0, 0.0, 200000000.0, -426666666.7, 0.0, 0.0, 0.0, 0.0,120000000.0, -180000000.0, -200000000.0, 786666666.7, 80000000.0,-180000000.0, 0.0, 0.0],
[0.0, 0.0, 0.0, -200000000.0, -160000000.0, 200000000.0, 0.0, 0.0,0.0, 0.0, -480000000.0, 80000000.0, 1120000000.0, -200000000.0,-480000000.0, 120000000.0],
[0.0, 0.0, -200000000.0, 0.0, 200000000.0, -426666666.7, 0.0, 0.0,0.0, 0.0, 120000000.0, -180000000.0, -200000000.0, 786666666.7,80000000.0, -180000000.0],
[0.0, 0.0, 0.0, 0.0, 0.0, -200000000.0, -80000000.0, 120000000.0,0.0, 0.0, 0.0, 0.0, -480000000.0, 80000000.0, 560000000.0, 0.0],
[0.0, 0.0, 0.0, 0.0, -200000000.0, 0.0, 80000000.0, -213333333.3,0.0, 0.0, 0.0, 0.0, 120000000.0, -180000000.0, 0.0, 393333333.3]])
x = [sp.var('dx' + str(i+1)) for i in range(8)]
y = [sp.var('dy' + str(i+1)) for i in range(8)]
fx = [sp.var('fx' + str(i+1)) for i in range(8)]
fy = [sp.var('fy' + str(i+1)) for i in range(8)]

xy = list(sum(zip(x, y), ()))
fxy = list(sum(zip(fx, fy), ()))

M = sp.Matrix(K) * sp.Matrix(xy)
Ec = [sp.Eq(M[i], fxy[i]) for i in range(16)]

# known values
d_kwn = [(dy1, 0), (dy2, 0), (dy3, 0), (dy4, 0)]
f_kwn = [(fx5, 0), (fy5, 0), (fx6, 0), (fy6, -3000), (fx7, 0), (fy7, -3000),
         (fx8, 0), (fy8, 0), (fx1, 0), (fx2, 0), (fx3, 0), (fx4, 0)]

for var in d_kwn:
    for i, eq in enumerate(Ec):
        Ec[i] = eq.subs(var[0], var[1])

for var in f_kwn:
    for i, eq in enumerate(Ec):
        Ec[i] = eq.subs(var[0], var[1])

Sols = sp.solvers.solve(Ec)
sp.Matrix(sorted(Sols.items(), key=str))
And this is the output I'm getting:
{dx1: dx8 - 3.54468009860439e-6,
 dx2: dx8 - 1.8414987360977e-6,
 dx3: dx8 - 2.11496606381994e-7,
 dx4: dx8 + 2.05943267588118e-7,
 dx5: dx8 - 1.24937663359153e-6,
 dx6: dx8 - 1.55655946713284e-6,
 dx7: dx8 - 1.08797652070783e-6,
 dy5: -2.10639657360695e-6,
 dy6: -6.26959460018537e-6,
 dy7: -6.32191585665888e-6,
 dy8: -2.7105825114088e-6,
 fy1: 439.746516706791,
 fy2: 2640.65618690176,
 fy3: 2399.44807607611,
 fy4: 520.14922031534}
I don't know why I'm not getting a result for dx8. I tried adding more equations because theoretically dx1 = dx4, dx2 = dx3, dx5 = dx8, dx6 = dx7, and so on, but that gives me an empty list.
Any help will be appreciated.
If you need to use SymPy, then the following may work. First we can solve the reduced system of equations for only the unknown d values. Then, once we know all the d values, we can calculate the unknown f values by evaluating [K][d]=[f] for only the equations whose f is unknown (not implemented in the code below).
import sympy as sp
import numpy as np
K = np.array([[560000000.0, 0.0, -480000000.0, 80000000.0, 0.0, 0.0, 0.0, 0.0,-80000000.0, 120000000.0, 0.0, -200000000.0, 0.0, 0.0, 0.0, 0.0],
[0.0, 393333333.3, 120000000.0, -180000000.0, 0.0, 0.0, 0.0, 0.0,80000000.0, -213333333.3, -200000000.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[-480000000.0, 120000000.0, 1120000000.0, -200000000.0,-480000000.0, 80000000.0, 0.0, 0.0, 0.0, 0.0, -160000000.0,200000000.0, 0.0, -200000000.0, 0.0, 0.0],
[80000000.0, -180000000.0, -200000000.0, 786666666.7, 120000000.0,-180000000.0, 0.0, 0.0, 0.0, 0.0, 200000000.0, -426666666.7,-200000000.0, 0.0, 0.0, 0.0],
[0.0, 0.0, -480000000.0, 120000000.0, 1120000000.0, -200000000.0,-480000000.0, 80000000.0, 0.0, 0.0, 0.0, 0.0, -160000000.0,200000000.0, 0.0, -200000000.0],
[0.0, 0.0, 80000000.0, -180000000.0, -200000000.0, 786666666.7,120000000.0, -180000000.0, 0.0, 0.0, 0.0, 0.0, 200000000.0,-426666666.7, -200000000.0, 0.0],
[0.0, 0.0, 0.0, 0.0, -480000000.0, 120000000.0, 560000000.0,-200000000.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -80000000.0, 80000000.0],
[0.0, 0.0, 0.0, 0.0, 80000000.0, -180000000.0, -200000000.0,393333333.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 120000000.0,-213333333.3],
[-80000000.0, 80000000.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 560000000.0,-200000000.0, -480000000.0, 120000000.0, 0.0, 0.0, 0.0, 0.0],
[120000000.0, -213333333.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,-200000000.0, 393333333.3, 80000000.0, -180000000.0, 0.0, 0.0, 0.0,0.0],
[0.0, -200000000.0, -160000000.0, 200000000.0, 0.0, 0.0, 0.0, 0.0,-480000000.0, 80000000.0, 1120000000.0, -200000000.0, -480000000.0,120000000.0, 0.0, 0.0],
[-200000000.0, 0.0, 200000000.0, -426666666.7, 0.0, 0.0, 0.0, 0.0,120000000.0, -180000000.0, -200000000.0, 786666666.7, 80000000.0,-180000000.0, 0.0, 0.0],
[0.0, 0.0, 0.0, -200000000.0, -160000000.0, 200000000.0, 0.0, 0.0,0.0, 0.0, -480000000.0, 80000000.0, 1120000000.0, -200000000.0,-480000000.0, 120000000.0],
[0.0, 0.0, -200000000.0, 0.0, 200000000.0, -426666666.7, 0.0, 0.0,0.0, 0.0, 120000000.0, -180000000.0, -200000000.0, 786666666.7,80000000.0, -180000000.0],
[0.0, 0.0, 0.0, 0.0, 0.0, -200000000.0, -80000000.0, 120000000.0,0.0, 0.0, 0.0, 0.0, -480000000.0, 80000000.0, 560000000.0, 0.0],
[0.0, 0.0, 0.0, 0.0, -200000000.0, 0.0, 80000000.0, -213333333.3,0.0, 0.0, 0.0, 0.0, 120000000.0, -180000000.0, 0.0, 393333333.3]])
x = [sp.var('dx' + str(i+1)) for i in range(8)]
y = [sp.var('dy' + str(i+1)) for i in range(8)]
fx = [sp.var('fx' + str(i+1)) for i in range(8)]
fy = [sp.var('fy' + str(i+1)) for i in range(8)]

xy = list(sum(zip(x, y), ()))
fxy = list(sum(zip(fx, fy), ()))

M = sp.Matrix(K) * sp.Matrix(xy)
Ec = [sp.Eq(M[i], fxy[i]) for i in range(16)]

# known values
d_kwn = [(dy1, 0), (dy2, 0), (dy3, 0), (dy4, 0)]
f_kwn = [(fx5, 0), (fy5, 0), (fx6, 0), (fy6, -3000), (fx7, 0), (fy7, -3000),
         (fx8, 0), (fy8, 0), (fx1, 0), (fx2, 0), (fx3, 0), (fx4, 0)]

for var in d_kwn:
    for i, eq in enumerate(Ec):
        Ec[i] = eq.subs(var[0], var[1])

for var in f_kwn:
    for i, eq in enumerate(Ec):
        Ec[i] = eq.subs(var[0], var[1])

Ec_part = []
for i in [0, 2, 4, 6, 8, 9, 10, 11, 12, 13, 14, 15]:
    Ec_part.append(Ec[i])

unknwns = [*x, *y[4:8]]
Sols = sp.linsolve(Ec_part, unknwns)
Sols = next(iter(Sols))
#sp.Matrix(sorted(Sols.items(), key=str))
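The back-substitution for the unknown f values, which the answer notes is not implemented, could look roughly like this. It is only a sketch, assuming Sols is the tuple returned by linsolve above and that rows 1, 3, 5, 7 are the equations whose right-hand sides (fy1..fy4) are unknown:

# Sketch (assumption): substitute the solved d values back into [K][d] = [f]
subs_map = dict(zip(unknwns, Sols))   # solved displacement values
subs_map.update(dict(d_kwn))          # known displacements (dy1..dy4 = 0)
d_full = sp.Matrix(xy).subs(subs_map)
f_full = sp.Matrix(K) * d_full
unknown_f = {fxy[i]: f_full[i] for i in (1, 3, 5, 7)}  # fy1..fy4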
It is convenient to solve a system of linear equations in NumPy itself. The type of system you are solving appears often in Finite Element Analysis with boundary conditions. Is it fine if we only use NumPy? If yes, the following code will do the job. Since we already know which elements of f and d are known, we can use NumPy array indexing to solve the reduced set of equations as follows:
import numpy as np
# The NxN Coefficients matrix
K = np.array([[560000000.0, 0.0, -480000000.0, 80000000.0, 0.0, 0.0, 0.0, 0.0,-80000000.0, 120000000.0, 0.0, -200000000.0, 0.0, 0.0, 0.0, 0.0],
[0.0, 393333333.3, 120000000.0, -180000000.0, 0.0, 0.0, 0.0, 0.0,80000000.0, -213333333.3, -200000000.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[-480000000.0, 120000000.0, 1120000000.0, -200000000.0,-480000000.0, 80000000.0, 0.0, 0.0, 0.0, 0.0, -160000000.0,200000000.0, 0.0, -200000000.0, 0.0, 0.0],
[80000000.0, -180000000.0, -200000000.0, 786666666.7, 120000000.0,-180000000.0, 0.0, 0.0, 0.0, 0.0, 200000000.0, -426666666.7,-200000000.0, 0.0, 0.0, 0.0],
[0.0, 0.0, -480000000.0, 120000000.0, 1120000000.0, -200000000.0,-480000000.0, 80000000.0, 0.0, 0.0, 0.0, 0.0, -160000000.0,200000000.0, 0.0, -200000000.0],
[0.0, 0.0, 80000000.0, -180000000.0, -200000000.0, 786666666.7,120000000.0, -180000000.0, 0.0, 0.0, 0.0, 0.0, 200000000.0,-426666666.7, -200000000.0, 0.0],
[0.0, 0.0, 0.0, 0.0, -480000000.0, 120000000.0, 560000000.0,-200000000.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -80000000.0, 80000000.0],
[0.0, 0.0, 0.0, 0.0, 80000000.0, -180000000.0, -200000000.0,393333333.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 120000000.0,-213333333.3],
[-80000000.0, 80000000.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 560000000.0,-200000000.0, -480000000.0, 120000000.0, 0.0, 0.0, 0.0, 0.0],
[120000000.0, -213333333.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,-200000000.0, 393333333.3, 80000000.0, -180000000.0, 0.0, 0.0, 0.0,0.0],
[0.0, -200000000.0, -160000000.0, 200000000.0, 0.0, 0.0, 0.0, 0.0,-480000000.0, 80000000.0, 1120000000.0, -200000000.0, -480000000.0,120000000.0, 0.0, 0.0],
[-200000000.0, 0.0, 200000000.0, -426666666.7, 0.0, 0.0, 0.0, 0.0,120000000.0, -180000000.0, -200000000.0, 786666666.7, 80000000.0,-180000000.0, 0.0, 0.0],
[0.0, 0.0, 0.0, -200000000.0, -160000000.0, 200000000.0, 0.0, 0.0,0.0, 0.0, -480000000.0, 80000000.0, 1120000000.0, -200000000.0,-480000000.0, 120000000.0],
[0.0, 0.0, -200000000.0, 0.0, 200000000.0, -426666666.7, 0.0, 0.0,0.0, 0.0, 120000000.0, -180000000.0, -200000000.0, 786666666.7,80000000.0, -180000000.0],
[0.0, 0.0, 0.0, 0.0, 0.0, -200000000.0, -80000000.0, 120000000.0,0.0, 0.0, 0.0, 0.0, -480000000.0, 80000000.0, 560000000.0, 0.0],
[0.0, 0.0, 0.0, 0.0, -200000000.0, 0.0, 80000000.0, -213333333.3,0.0, 0.0, 0.0, 0.0, 120000000.0, -180000000.0, 0.0, 393333333.3]])
# A logical array for indexing
N = K.shape[0] # The number of columns in K
N_2 = int(N/2);
# Prepare the 'f'
fx = np.zeros( N_2 );
fy = np.zeros( N_2 );
fx[ [0,1,2,3,4,5,6,7] ] = np.array([0]*N_2) # Known values of fx
fy[ [4,5,6,7] ] = np.array([0,-3000,-3000,0])
f = np.concatenate( (fx,fy) )
# Solve for the unknown equations only
d = np.zeros( N )
rows = np.array([0,1,2,3,4,5,6,7,12,13,14,15])
rows = rows[:, np.newaxis]
columns = np.array([0,1,2,3,4,5,6,7,12,13,14,15])
d[ columns ] = np.linalg.solve( K[ rows, columns ], f[ columns ] )
# Calculate unknown f values from the full rows of K (i.e. f = K @ d for those equations)
f[ [8,9,10,11] ] = K[ [8,9,10,11], : ] @ d
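As a quick sanity check (not part of the original answer), you can verify that the assembled d and f are consistent with the full system; if the ordering of the entries of d and f does not match the ordering K was assembled with, this check will reveal it:

# Residual of the full system should be ~0 if the index bookkeeping above is consistent
residual = K @ d - f
print(np.max(np.abs(residual)))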

Gaussian elimination with partial pivoting (column)

I cannot find the mistake I made; could anyone help me? Thanks very much!
import math

def GASSEM():
    a0 = [12, -2, 1, 0, 0, 0, 0, 0, 0, 0, 13.97]
    a1 = [-2, 12, -2, 1, 0, 0, 0, 0, 0, 0, 5.93]
    a2 = [1, -2, 12, -2, 1, 0, 0, 0, 0, 0, -6.02]
    a3 = [0, 1, -2, 12, -2, 1, 0, 0, 0, 0, 8.32]
    a4 = [0, 0, 1, -2, 12, -2, 1, 0, 0, 0, -23.75]
    a5 = [0, 0, 0, 1, -2, 12, -2, 1, 0, 0, 28.45]
    a6 = [0, 0, 0, 0, 1, -2, 12, -2, 1, 0, -8.9]
    a7 = [0, 0, 0, 0, 0, 1, -2, 12, -2, 1, -10.5]
    a8 = [0, 0, 0, 0, 0, 0, 1, -2, 12, -2, 10.34]
    a9 = [0, 0, 0, 0, 0, 0, 0, 1, -2, 12, -38.74]
    A = [a0, a1, a2, a3, a4, a5, a6, a7, a8, a9]  # 10x11 matrix
    interchange = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    for i in range(1, 10):
        median = abs(A[i-1][i-1])
        for m in range(i, 10):  # pivoting
            if abs(A[m][i-1]) > median:
                median = abs(A[m][i-1])
                interchange = A[i-1]
                A[i-1] = A[m]
                A[m] = interchange
        for j in range(i, 10):  # creating upper triangle matrix
            A[j] = [A[j][k] - (A[j][i-1]/A[i-1][i-1])*A[i-1][k] for k in range(0, 11)]
    for t in range(0, 10):  # print the upper triangle matrix
        print(A[t])
The output is not an upper triangular matrix; I'm getting lost in the for loops...
When I run this code, the output is
[12, -2, 1, 0, 0, 0, 0, 0, 0, 0, 13.97]
[0.0, 11.666666666666666, -1.8333333333333333, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 8.258333333333333]
[0.0, 0.0, 11.628571428571428, -1.842857142857143, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, -5.886428571428571]
[0.0, 0.0, -2.220446049250313e-16, 11.622235872235873, -1.8415233415233416, 1.0, 0.0, 0.0, 0.0, 0.0, 6.679281326781327]
[0.0, 0.0, -3.518258683818212e-17, 0.0, 11.622218698800275, -1.8415517150256329, 1.0, 0.0, 0.0, 0.0, -22.185475397706252]
[0.0, 0.0, 1.3530439218911067e-17, 0.0, 0.0, 11.62216239813737, -1.841549039580908, 1.0, 0.0, 0.0, 24.359991632712457]
[0.0, 0.0, 5.171101701700419e-18, 0.0, 0.0, 0.0, 11.622161705324444, -1.84154850220678, 1.0, 0.0, -3.131238144426707]
[0.0, 0.0, -3.448243038110395e-19, 0.0, 0.0, 0.0, 0.0, 11.62216144141611, -1.8415485389982904, 1.0, -13.0921440313208]
[0.0, 0.0, -4.995725026226573e-19, 0.0, 0.0, 0.0, 0.0, 0.0, 11.622161418001749, -1.8415485322346454, 8.534950160892514]
[0.0, 0.0, -4.9488445836100553e-20, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 11.622161417603511, -36.26114362292296]
This effectively is upper triangular. The absolute values of the 'non-zero' entries in the third column of the lower triangle are all less than 10e-15. Given that the other values are 1 or greater, these small numbers look like floating point subtraction errors in A[j][k] - (A[j][i-1]/A[i-1][i-1])*A[i-1][k] and can be considered to be 0. Without more investigation, I don't know why the non-zero values are limited to this column.
For this data, the condition abs(A[m][i-1]) > median is never true, so the if block code is not tested.
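If you want the printed matrix to show exact zeros instead of those tiny residues, one option (not from the original answer, just a common cleanup) is to snap anything below a tolerance to zero before printing, e.g. inside GASSEM after the elimination loops:

tol = 1e-12
for t in range(0, 10):
    A[t] = [0.0 if abs(v) < tol else v for v in A[t]]  # clamp floating point noise to zero
    print(A[t])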
