Getting all dihedral angles in PyMOL

I want to get all the dihedral angles of a protein in PyMOL (phi, psi, chi1, chi2, chi3, chi4), but I have only managed to find a function that shows me phi and psi.
For instance:
PyMOL>phi_psi 1a11
SER-2: ( 67.5, 172.8 )
GLU-3: ( -59.6, -19.4 )
LYS-4: ( -66.4, -61.7 )
MET-5: ( -64.1, -17.9 )
SER-6: ( -78.3, -33.7 )
THR-7: ( -84.0, -18.1 )
ALA-8: ( -85.7, -40.8 )
ILE-9: ( -75.1, -30.8 )
SER-10: ( -77.6, -47.0 )
VAL-11: ( -61.3, -27.4 )
LEU-12: ( -60.7, -47.5 )
LEU-13: ( -71.1, -38.6 )
ALA-14: ( -46.2, -50.7 )
GLN-15: ( -69.1, -47.4 )
ALA-16: ( -41.9, -52.6 )
VAL-17: ( -82.6, -23.7 )
PHE-18: ( -53.4, -63.4 )
LEU-19: ( -61.2, -30.4 )
LEU-20: ( -61.1, -32.3 )
LEU-21: ( -80.6, -60.1 )
THR-22: ( -45.9, -34.4 )
SER-23: ( -74.5, -47.8 )
GLN-24: ( -83.5, 11.0 )
It's missing the chi angles (the side-chain dihedrals). Does anyone know how to get all the dihedral angles?
Many thanks!

You can get arbitrary dihedral angles with get_dihedral. Create four selections, each containing a single atom, and then use it like this:
get_dihedral s1, s2, s3, s4
It's exposed to the Python API as cmd.get_dihedral(). I suggest writing a Python script that uses this function along with cmd.iterate() to loop over residues. Create a dict so that for each residue type you can look up the list of atom quadruples that define its chi angles.
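For example, here is a minimal sketch of such a script (my own, not a built-in PyMOL command: the all_chi1 name is arbitrary, and the dict covers chi1 for only a handful of residue types; extend it with the remaining residues and the chi2-chi5 quadruples for full coverage):

from pymol import cmd

# chi1 is the N-CA-CB-X dihedral; the fourth atom name depends on the residue type.
# NOTE: deliberately incomplete -- add the remaining residue types (and the
# chi2-chi5 quadruples) as needed.
CHI1_ATOMS = {
    "SER": ("N", "CA", "CB", "OG"),
    "CYS": ("N", "CA", "CB", "SG"),
    "THR": ("N", "CA", "CB", "OG1"),
    "VAL": ("N", "CA", "CB", "CG1"),
    "ILE": ("N", "CA", "CB", "CG1"),
    "LEU": ("N", "CA", "CB", "CG"),
    "LYS": ("N", "CA", "CB", "CG"),
    "MET": ("N", "CA", "CB", "CG"),
    "GLU": ("N", "CA", "CB", "CG"),
    "GLN": ("N", "CA", "CB", "CG"),
}

def all_chi1(selection="polymer"):
    residues = []
    # one entry per residue, collected from the CA atoms
    cmd.iterate(selection + " and name CA",
                "residues.append((model, chain, resi, resn))",
                space={"residues": residues})
    for model, chain, resi, resn in residues:
        atoms = CHI1_ATOMS.get(resn)
        if atoms is None:
            continue  # GLY/ALA have no chi1; unlisted types are skipped
        # NOTE: structures with blank chain IDs may need the chain clause adjusted
        sele = ["%s and chain %s and resi %s and name %s" % (model, chain, resi, a)
                for a in atoms]
        try:
            print("%s-%s: chi1 = %.1f" % (resn, resi, cmd.get_dihedral(*sele)))
        except Exception:
            pass  # skip residues with missing atoms

cmd.extend("all_chi1", all_chi1)

After loading the script (run all_chi1.py), typing all_chi1 at the PyMOL prompt prints chi1 for every residue it can resolve.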

You can do this easily in R. This link contains information on how to calculate the main-chain and side-chain torsion/dihedral angles:
http://thegrantlab.org/bio3d/html/torsion.pdb.html
But first you have to install the Bio3D package for R: http://thegrantlab.org/bio3d/download
After installing the package, load it by typing library(bio3d) at the R console prompt.
>library(bio3d)
This R script answers your question:
#returns the file path of the current working directory.
getwd()
#sets the working directory to where you want.
setwd("~/R/Rscripts")
#fetches the pdb file from the protein data bank and saves to dataframe 'pb'
pb <- read.pdb("insert PDB ID")
#trim to protein only
pb.prot <- trim.pdb(pb, "protein")
#calculates the torsion angles of the protein and saves them to 'tor'
tor <- torsion.pdb(pb.prot)
#to get the number of rows and columns of 'tor'
dim(tor$tbl)
#identify each row by their chain, residue ID and residue Number obtained from your PDB entry
res_label <- paste(pb.prot$atom$chain[pb.prot$calpha], pb.prot$atom$resid[pb.prot$calpha], pb.prot$atom$resno[pb.prot$calpha], sep="-")
rownames(tor$tbl) <- res_label
#creates a table of the torsion angles
torsion <- tor$tbl
#For example, to look at the angles for VAL, residue 223 from chain A
tor$tbl["A-VAL-223",]
#writes out the table to a file
write.table(torsion, file = "torsion_angles.txt", quote = F, sep = "\t")
Your output file, which is saved in your working directory, will contain a table of chain-resID-resNo and the corresponding phi, psi, chi1, chi2, chi3, chi4, and chi5 values. Good luck!

#install the bio3d package (install.packages("bio3d")) if needed, then load it
library(bio3d)
#returns the file path of the current working directory.
getwd()
#sets the working directory to where you want.
setwd("~/R/Rscripts")
#fetches the pdb file from the protein data bank and saves to dataframe 'pb'
pb <- read.pdb("insert PDB ID")
#trim to protein only
pb.prot <- trim.pdb(pb, "protein")
#calculates the torsion angles of the protein and saves them to 'tor'
tor <- torsion.pdb(pb.prot)
#to get the number of rows and columns of 'tor'
dim(tor$tbl)
#identify each row by their chain, residue ID and residue Number obtained from your PDB entry
res_label <- paste(pb.prot$atom$chain[pb.prot$calpha], pb.prot$atom$resid[pb.prot$calpha], pb.prot$atom$resno[pb.prot$calpha], sep="-")
rownames(tor$tbl) <- res_label
#creates a table of the torsion angles
torsion <- tor$tbl
#For example, to look at the angles for GLY, residue 65 from chain A
tor$tbl["A-GLY-65",]
#convert the matrix of doubles into a proper data frame
dataframe_df <- as.data.frame.matrix(torsion)
#write the data frame to a .csv file (write.csv ignores col.names, so don't set it)
write.csv(dataframe_df, file="name.csv", row.names=TRUE)

Related

How to find the shortest distance between two line segments capturing the sign values with python

I have a pandas dataframe of the form:
benchmark_x benchmark_y ref_point_x ref_point_y
0 525039.140 175445.518 525039.145 175445.539
1 525039.022 175445.542 525039.032 175445.568
2 525038.944 175445.558 525038.954 175445.588
3 525038.855 175445.576 525038.859 175445.576
4 525038.797 175445.587 525038.794 175445.559
5 525038.689 175445.609 525038.679 175445.551
6 525038.551 175445.637 525038.544 175445.577
7 525038.473 175445.653 525038.459 175445.594
8 525038.385 175445.670 525038.374 175445.610
9 525038.306 175445.686 525038.289 175445.626
I am trying to find the shortest distance from the line to the benchmark such that if the line is above the benchmark the distance is positive and if it is below the benchmark the distance is negative. See image below:
I used the KDTree from scipy like so:
from scipy.spatial import KDTree
tree=KDTree(df[["benchmark_x", "benchmark_y"]])
test = df.apply(lambda row: tree.query(row[["ref_point_x", "ref_point_y"]]), axis=1)
test=test.apply(pd.Series, index=["distance", "index"])
This seems to work, except that it fails to capture negative values when the line is below the benchmark.
import numpy as np
import pandas as pd

# recreating your example
columns = "benchmark_x benchmark_y ref_point_x ref_point_y".split(" ")
data = """525039.140 175445.518 525039.145 175445.539
525039.022 175445.542 525039.032 175445.568
525038.944 175445.558 525038.954 175445.588
525038.855 175445.576 525038.859 175445.576
525038.797 175445.587 525038.794 175445.559
525038.689 175445.609 525038.679 175445.551
525038.551 175445.637 525038.544 175445.577
525038.473 175445.653 525038.459 175445.594
525038.385 175445.670 525038.374 175445.610
525038.306 175445.686 525038.289 175445.626"""
data = [float(x) for x in data.replace("\n"," ").split(" ") if len(x)>0]
arr = np.array(data).reshape(-1,4)
df = pd.DataFrame(arr, columns=columns)
# adding your two new columns to the df
from scipy.spatial import KDTree
tree=KDTree(df[["benchmark_x", "benchmark_y"]])
df["distance"], df["index"] = tree.query(df[["ref_point_x", "ref_point_y"]])
Now, to decide whether one line is above the other, we have to compare y values at the same x position. Therefore we interpolate the y values of one line at the x positions of the other.
df = df.sort_values("ref_point_x") # sorting is required for interpolation
xy_refpoint = df[["ref_point_x", "ref_point_y"]].values
df["ref_point_y_at_benchmark_x"] = np.interp(df["benchmark_x"], xy_refpoint[:,0], xy_refpoint[:,1])
And finally your criterion can be evaluated and applied:
df["distance"] = np.where(df["ref_point_y_at_benchmark_x"] < df["benchmark_y"], -df["distance"], df["distance"])
# or change the < to >, <=, or >= as you wish
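As a side note: if the rows in your dataframe already pair each benchmark point with its reference point, the sign can also be obtained directly from the 2-D cross product of the local benchmark direction and the vector to the reference point, with no interpolation. A sketch (apply it to the unsigned KDTree distances, i.e. before the np.where step above; the signed_distance name is mine, and the sign convention flips with the traversal direction of the benchmark, so verify it against a known case):

import numpy as np

bx = df["benchmark_x"].to_numpy()
by = df["benchmark_y"].to_numpy()
tx, ty = np.gradient(bx), np.gradient(by)  # local tangent of the benchmark polyline
vx = df["ref_point_x"].to_numpy() - bx
vy = df["ref_point_y"].to_numpy() - by
# z-component of the cross product: positive on one side of the line, negative on the other
df["signed_distance"] = np.sign(tx * vy - ty * vx) * df["distance"]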

Using obspy.taup with variables obtained from a .txt file

I am new to Python and I am trying to get a piece of code working with a module of the obspy package. From a .txt file with a row of five values separated by commas (example: 40,47.698,146.9212, etc....) I need to use these values as variables in a function of the obspy module. I will show you the code so you understand better.
from obspy.taup import TauPyModel
model = TauPyModel(model="iasp91")
archivo=open('Dato.txt', 'r')
for linea in archivo.readlines():
    columna = str(linea).split(',')
    print(columna[0])
    print(columna[1])
    print(columna[2])
    print(columna[3])
    print(columna[4])
archivo.close()
a=columna[0]
b=columna[1]
c=columna[2]
d=columna[3]
e=columna[4]
arrivals=model.get_pierce_points_geo(a, b, c, d, e, phase_list=('SKS',), resample=False)
arrival = arrivals[0]
print(arrival.pierce)
If I define the variables as numeric values (example: a=408; b=47.6981; c=146.9212; etc.), the code works perfectly and shows me what I want:
408
47.6981
146.9212
36.882277
-3.068689
C:\Users\peopl\Desktop\BO\env\lib\site-packages\obspy\taup\tau_branch.py:496: UserWarning: Resizing a TauP array inplace failed due to the existence of other references to the array, creating a new array. See Obspy #2280.
warnings.warn(msg)
[ ( 323.37738085, 0.00000000e+00, 0.00000000e+00, 408. , 47.6981 , 146.9212 )
( 323.37738085, 4.25942791e-01, 9.18383444e-05, 410. , 47.70292225, 146.9180712 )
( 323.37738085, 4.95211705e+01, 1.33680904e-02, 660. , 48.39912219, 146.45957792)
( 323.37738085, 4.30994629e+02, 3.09568047e-01, 2889. , 63.17117462, 131.25054174)
( 323.37738085, 6.19102877e+02, 7.88455257e-01, 3482.54497821, 73.50766588, 55.65029149)
( 323.37738085, 8.07211124e+02, 1.26734247e+00, 2889. , 54.05973754, 7.50927585)
( 323.37738085, 1.18868458e+03, 1.56354242e+00, 660. , 38.47340944, -2.34102958)
( 323.37738085, 1.23777981e+03, 1.57681868e+00, 410. , 37.75869395, -2.67200616)
( 323.37738085, 1.23820575e+03, 1.57691051e+00, 408. , 37.75374671, -2.67427329)
( 323.37738085, 1.28179336e+03, 1.58536568e+00, 210. , 37.29809076, -2.88171143)
( 323.37738085, 1.32180477e+03, 1.59207012e+00, 35. , 36.93652754, -3.04441779)
( 323.37738085, 1.32587993e+03, 1.59253065e+00, 20. , 36.91168346, -3.05553737)
( 323.37738085, 1.33192110e+03, 1.59307573e+00, 0. , 36.882277 , -3.068689 )]
Nevertheless, when I use the variables from the .txt file, the code shows this:
408
47.6981
146.9212
36.882277
-3.068689
Traceback (most recent call last):
File "pierce.py", line 20, in <module>
arrivals=model.get_pierce_points_geo(a, b, c, d, e, phase_list=('SKS',), resample=False)
File "C:\Users\peopl\Desktop\BO\env\lib\site-packages\obspy\taup\tau.py", line 784, in get_pierce_points_geo
distance_in_deg = calc_dist(source_latitude_in_deg,
File "C:\Users\peopl\Desktop\BO\env\lib\site-packages\obspy\taup\taup_geo.py", line 53, in calc_dist
return calc_dist_azi(source_latitude_in_deg, source_longitude_in_deg,
File "C:\Users\peopl\Desktop\BO\env\lib\site-packages\obspy\taup\taup_geo.py", line 86, in calc_dist_azi
g = ellipsoid.Inverse(source_latitude_in_deg,
File "C:\Users\peopl\Desktop\BO\env\lib\site-packages\geographiclib\geodesic.py", line 1035, in Inverse
a12, s12, salp1,calp1, salp2,calp2, m12, M12, M21, S12 = self._GenInverse(
File "C:\Users\peopl\Desktop\BO\env\lib\site-packages\geographiclib\geodesic.py", line 712, in _GenInverse
lon12, lon12s = Math.AngDiff(lon1, lon2)
File "C:\Users\peopl\Desktop\BO\env\lib\site-packages\geographiclib\geomath.py", line 156, in AngDiff
d, t = Math.sum(Math.AngNormalize(-x), Math.AngNormalize(y))
TypeError: bad operand type for unary -: 'str'
The numeric values in the first five printed lines are the same as those from the .txt file, but there seems to be a problem with 'str'. I would be grateful if you could help me solve it. Sorry for my archaic English and my novice status in Python.
Thank you very much and greetings to all of you.
I solved the problem thanks to someone in the Spanish community. I had to convert the variables to numeric values because the code was reading them as strings. I solved it with this change:
a = float(columna[0])
b = float(columna[1])
c = float(columna[2])
d = float(columna[3])
e = float(columna[4])
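For what it's worth, the same fix in a more compact form (a sketch, assuming Dato.txt holds a single comma-separated line of five values):

# read one line, split on commas, and convert every field to float at once
with open('Dato.txt') as archivo:
    a, b, c, d, e = (float(x) for x in archivo.readline().split(','))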
Thanks to everyone, and I hope to continue learning (programming in Python and writing in English).

Multiple Linear Regression in Power BI

Suppose I have a set of returns and I want to compute its beta values versus different market indices. Let's use the following set of data in a table named Returns for the sake of having a concrete example:
Date Equity Duration Credit Manager
-----------------------------------------------
01/31/2017 2.907% 0.226% 1.240% 1.78%
02/28/2017 2.513% 0.493% 1.120% 3.88%
03/31/2017 1.346% -0.046% -0.250% 0.13%
04/30/2017 1.612% 0.695% 0.620% 1.04%
05/31/2017 2.209% 0.653% 0.480% 1.40%
06/30/2017 0.796% -0.162% 0.350% 0.63%
07/31/2017 2.733% 0.167% 0.830% 2.06%
08/31/2017 0.401% 1.083% -0.670% 0.29%
09/30/2017 1.880% -0.857% 1.430% 2.04%
10/31/2017 2.151% -0.121% 0.510% 2.33%
11/30/2017 2.020% -0.137% -0.020% 3.06%
12/31/2017 1.454% 0.309% 0.230% 1.28%
Now in Excel, I can just use the LINEST function to get the beta values:
= LINEST(Returns[Manager], Returns[[Equity]:[Credit]], TRUE, TRUE)
It spits out an array that looks like this:
0.077250253 -0.184974002 0.961578127 -0.001063971
0.707796954 0.60202895 0.540811546 0.008257129
0.50202386 0.009166729 #N/A #N/A
2.688342242 8 #N/A #N/A
0.000677695 0.000672231 #N/A #N/A
The betas are in the top row and using them gives me the following linear estimate:
Manager = 0.962 * Equity - 0.185 * Duration + 0.077 * Credit - 0.001
The question is how can I get these values in Power BI using DAX (preferably without having to write a custom R script)?
For simple linear regression against one column, I can go back to the mathematical definition and write a least squares implementation similar to the one given in this post.
However, when more columns become involved (I need to be able to do up to 12 columns, but not always the same number), this gets messy really quickly and I'm hoping there's a better way.
The essence:
DAX is not the way to go. Use Home > Edit Queries and then Transform > Run R Script. Insert the following R snippet to run a regression analysis using all available variables in a table:
model <- lm(Manager ~ . , dataset)
df<- data.frame(coef(model))
names(df)[names(df)=="coef.model."] <- "coefficients"
df['variables'] <- row.names(df)
Edit Manager to any of the other available variable names to change the dependent variable.
The details:
Good question! Why Microsoft has not introduced more flexible solutions is beyond my understanding. But for the time being, you won't find very good approaches without using R in Power BI.
My suggested approach will therefore ignore your request regarding:
The question is how can I get these values in Power BI using DAX
(preferably without having to write a custom R script)?
My answer will however meet your requirements regarding:
A good answer should generalize to more than 3 columns (probably by
working on an unpivoted data table with the indices as values rather
than column headers).
Here we go:
I'm on a system that uses a comma as the decimal separator, so I'm going to use the following as the data source. (If you copy the numbers directly into Power BI, the column separation will not be maintained; if you first paste them into Excel, copy them again, and THEN paste them into Power BI, the columns will be fine.)
Date Equity Duration Credit Manager
31.01.2017 2,907 0,226 1,24 1,78
28.02.2017 2,513 0,493 1,12 3,88
31.03.2017 1,346 -0,046 -0,25 0,13
30.04.2017 1,612 0,695 0,62 1,04
31.05.2017 2,209 0,653 0,48 1,4
30.06.2017 0,796 -0,162 0,35 0,63
31.07.2017 2,733 0,167 0,83 2,06
31.08.2017 0,401 1,083 -0,67 0,29
30.09.2017 1,88 -0,857 1,43 2,04
31.10.2017 2,151 -0,121 0,51 2,33
30.11.2017 2,02 -0,137 -0,02 3,06
31.12.2017 1,454 0,309 0,23 1,28
Starting from scratch in Power BI (for reproducibility purposes) I'm inserting the data using Enter Data:
Now, go to Home > Edit Queries and check that you have this:
In order to maintain flexibility with regard to the number of columns to include in your analysis, I find it best to remove the Date column. This will not have an impact on your regression results. Simply right-click the Date column and select Remove:
Notice that this will add a new step under Query Settings > Applied Steps:
And this is where you are going to be able to edit the few lines of R code we're going to use. Now, go to Transform > Run R Script to open this window:
Notice the line # 'dataset' holds the input data for this script. Thankfully, your question involves only ONE input table, so things aren't going to get too complicated (for multiple input tables, check out this post). The dataset variable is an R data.frame and is a good (the only...) starting point for further analysis.
Insert the following script:
model <- lm(Manager ~ . , dataset)
df<- data.frame(coef(model))
names(df)[names(df)=="coef.model."] <- "coefficients"
df['variables'] <- row.names(df)
Click OK, and if all goes well you should end up with this:
Click Table, and you'll get this:
Under Applied Steps you'll see that a Run R Script step has been inserted. Click the gear icon on the right to edit it, or click on df to format the output table.
This is it! For the Edit Queries part at least.
Click Home > Close & Apply to get back to the Power BI Report section and verify that you have a new table under Visualizations > Fields:
Insert a Table or Matrix and activate Coefficients and Variables to get this:
I hope this is what you were looking for!
Now for some details about the R script:
As far as possible, I would avoid using numerous different R libraries; this reduces the risk of dependency issues.
The function lm() handles the regression analysis. The key to obtaining the required flexibility in the number of explanatory variables lies in the Manager ~ . , dataset part. This simply says to run a regression analysis on the Manager variable in the dataframe dataset, using all remaining columns (~ .) as explanatory variables. The coef(model) part extracts the coefficient values from the estimated model. The result is a dataframe with the variable names as row names. The last line simply adds these names to the dataframe itself.
As there is no equivalent of or handy replacement for the LINEST function in Power BI (I'm sure you've done enough research before posting the question), any attempt would mean rewriting the whole function in Power Query / M, which is already not that "simple" even for simple linear regression, let alone multiple variables.
Rather than (re)inventing the wheel, it's inevitably much easier (one-liner code..) to do it with R script in Power BI.
It's not a bad option, given that I have no prior R experience. After a few searches and some trial and error, I was able to come up with this:
# 'dataset' holds the input data for this script
# install.packages("broom") # uncomment to install if package does not exist
library(broom)
model <- lm(Manager ~ Equity + Duration + Credit, dataset)
model <- tidy(model)
lm is R's built-in linear model function, and the tidy function comes from the broom package, which tidies up the output and produces a data frame for Power BI.
With the columns term and estimate, this should be sufficient to calculate the estimate you want.
The M Query for your reference:
let
Source = Csv.Document(File.Contents("returns.csv"),[Delimiter=",", Columns=5, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Date", type text}, {"Equity", Percentage.Type}, {"Duration", Percentage.Type}, {"Credit", Percentage.Type}, {"Manager", Percentage.Type}}),
#"Run R Script" = R.Execute("# 'dataset' holds the input data for this script#(lf)# install.packages(""broom"")#(lf)library(broom)#(lf)#(lf)model <- lm(Manager ~ Equity + Duration + Credit, dataset)#(lf)model <- tidy(model)",[dataset=#"Changed Type"]),
#"""model""" = #"Run R Script"{[Name="model"]}[Value]
in
#"""model"""
I've figured out how to do this for three variables specifically but this approach doesn't scale up or down to more or fewer variables at all.
Regression =
VAR ShortNames =
SELECTCOLUMNS (
Returns,
"A", [Equity],
"D", [Duration],
"C", [Credit],
"Y", [Manager]
)
VAR n = COUNTROWS ( ShortNames )
VAR A = SUMX ( ShortNames, [A] )
VAR D = SUMX ( ShortNames, [D] )
VAR C = SUMX ( ShortNames, [C] )
VAR Y = SUMX ( ShortNames, [Y] )
VAR AA = SUMX ( ShortNames, [A] * [A] ) - A * A / n
VAR DD = SUMX ( ShortNames, [D] * [D] ) - D * D / n
VAR CC = SUMX ( ShortNames, [C] * [C] ) - C * C / n
VAR AD = SUMX ( ShortNames, [A] * [D] ) - A * D / n
VAR AC = SUMX ( ShortNames, [A] * [C] ) - A * C / n
VAR DC = SUMX ( ShortNames, [D] * [C] ) - D * C / n
VAR AY = SUMX ( ShortNames, [A] * [Y] ) - A * Y / n
VAR DY = SUMX ( ShortNames, [D] * [Y] ) - D * Y / n
VAR CY = SUMX ( ShortNames, [C] * [Y] ) - C * Y / n
VAR BetaA =
DIVIDE (
AY*DC*DC - AD*CY*DC - AY*CC*DD + AC*CY*DD + AD*CC*DY - AC*DC*DY,
AD*CC*AD - AC*DC*AD - AD*AC*DC + AA*DC*DC + AC*AC*DD - AA*CC*DD
)
VAR BetaD =
DIVIDE (
AY*CC*AD - AC*CY*AD - AY*AC*DC + AA*CY*DC + AC*AC*DY - AA*CC*DY,
AD*CC*AD - AC*DC*AD - AD*AC*DC + AA*DC*DC + AC*AC*DD - AA*CC*DD
)
VAR BetaC =
DIVIDE (
- AY*DC*AD + AD*CY*AD + AY*AC*DD - AA*CY*DD - AD*AC*DY + AA*DC*DY,
AD*CC*AD - AC*DC*AD - AD*AC*DC + AA*DC*DC + AC*AC*DD - AA*CC*DD
)
VAR Intercept =
AVERAGEX ( ShortNames, [Y] )
- AVERAGEX ( ShortNames, [A] ) * BetaA
- AVERAGEX ( ShortNames, [D] ) * BetaD
- AVERAGEX ( ShortNames, [C] ) * BetaC
RETURN
{ BetaA, BetaD, BetaC, Intercept }
This is a calculated table that returns the regression coefficients specified:
These numbers match the output from LINEST for the data provided.
Note: The LINEST values I quoted in the question are slightly different from these, as they were calculated from unrounded returns rather than the rounded returns provided in the question.
I referenced this document for the calculation setup and used Mathematica to solve the system.
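If you want to sanity-check these coefficients outside of Power BI and Excel, here is a small numpy sketch (my own; it re-solves the least-squares problem on the rounded returns from the question's table, so expect small differences from the unrounded LINEST values):

import numpy as np

# rounded returns from the question's table, converted from percent to fractions
equity   = np.array([2.907, 2.513, 1.346, 1.612, 2.209, 0.796,
                     2.733, 0.401, 1.880, 2.151, 2.020, 1.454]) / 100
duration = np.array([0.226, 0.493, -0.046, 0.695, 0.653, -0.162,
                     0.167, 1.083, -0.857, -0.121, -0.137, 0.309]) / 100
credit   = np.array([1.240, 1.120, -0.250, 0.620, 0.480, 0.350,
                     0.830, -0.670, 1.430, 0.510, -0.020, 0.230]) / 100
manager  = np.array([1.78, 3.88, 0.13, 1.04, 1.40, 0.63,
                     2.06, 0.29, 2.04, 2.33, 3.06, 1.28]) / 100

# design matrix with an intercept column; solve min ||X*b - y||
X = np.column_stack([equity, duration, credit, np.ones_like(equity)])
betas, *_ = np.linalg.lstsq(X, manager, rcond=None)
print(betas)  # should be close to [0.962, -0.185, 0.077, -0.001]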

How to calculate standard deviation with R for a file with a single numeric column?

I have a file with the following data:
12341231
1231312
1233123
1231313
523454
6567
73525
I would like to read the file into an R object and calculate the standard deviation of the data.
I'd probably use scan for that file. You don't need to construct a data frame to calculate the standard deviation of a vector. scan reads the data into a vector, and it is faster than read.table for what you're doing here.
## put your data into a file, "new.txt"
> txt <- '12341231
1231312
1233123
1231313
523454
6567
73525'
> writeLines(txt, "new.txt")
## read and calculate standard deviation
> x <- scan("new.txt", what = integer())
> x
# [1] 12341231 1231312 1233123 1231313 523454 6567 73525
> sd(x)
# [1] 4426815
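For comparison, the same calculation works in Python too (a sketch; statistics.stdev uses the same sample (n-1) definition as R's sd):

import statistics

# read one integer per line and compute the sample standard deviation
with open("new.txt") as f:
    x = [int(line) for line in f if line.strip()]
print(statistics.stdev(x))  # should match R's sd(): about 4426815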

Spatially Subsetting Images in batch mode using IDL and ENVI

I would like to spatially subset Landsat images in ENVI using an IDL program. I have over 150 images to subset, so I'd like to run the program in batch mode (with no interaction). I know how to do it manually, but what commands would I use to spatially subset an image via lat/long coordinates in IDL code?
Here is some inspiration for a single file. You can do the same for a large number of files by building up a list of filenames and looping over it.
; define the image to be opened (could be in a loop), I believe it can also be a tif, img...
img_file='path/to/image.hdr'
envi_open_file,img_file,r_fid=fid
if (fid eq -1) then begin
print, 'Error when opening file ',img_file
return
endif
; let's define some coordinates
XMap=[-70.0580916, -70.5006694]
YMap=[-32.6030694, -32.9797194]
; now convert coordinates into pixel position:
; the transformation function uses the image geographic information:
ENVI_CONVERT_FILE_COORDINATES, FID, XF, YF, XMap, YMap
; pixel positions must be integers. Think twice here: maybe you need floor() or ceil() instead of round()
XF=ROUND(XF)
YF=ROUND(YF)
; read the image
envi_file_query, fid, DIMS=DIMS, NB=NB, NL=NL, NS=NS
pos = lindgen(nb)
; and store it in an array
image=fltarr(NS, NL, NB)
; read each band sequentially
FOR i=0, NB-1 DO BEGIN
image[*,*,i]= envi_get_data(fid=fid, dims=dims, pos=pos[i])
endfor
; simply crop the data with array indexing; keep all bands, and remember
; that IDL ranges are inclusive (hence the +1 on the output dimensions).
; This assumes XF[0] le XF[1] and YF[0] le YF[1]; swap the bounds if not.
imagen = image[XF[0]:XF[1], YF[0]:YF[1], *]
nl2 = YF[1]-YF[0]+1
ns2 = XF[1]-XF[0]+1
; read mapinfo to save it in the final file
map_info=envi_get_map_info(fid=fid)
envi_write_envi_file, imagen, data_type=4, $
descrip = 'cropped', $
map_info = map_info, $
nl=nl2, ns=ns2, nb=nb, r_fid=r_fid, $
OUT_NAME = 'path/to/cropped.hdr'